2. Read in the data
Load/Install the Required Packages
[1] "patchwork is loaded correctly"
Attaching package: ‘viridis’
The following object is masked from ‘package:viridisLite’:
viridis.map
[1] "viridis is loaded correctly"
[1] "Seurat is loaded correctly"
Loading required package: cowplot
Attaching package: ‘cowplot’
The following object is masked from ‘package:patchwork’:
align_plots
[1] "cowplot is loaded correctly"
Loading required package: gridExtra
[1] "gridExtra is loaded correctly"
Loading required package: grid
[1] "grid is loaded correctly"
Loading required package: Hmisc
Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Loading required package: ggplot2
Attaching package: ‘Hmisc’
The following object is masked from ‘package:Seurat’:
Key
The following objects are masked from ‘package:base’:
format.pval, units
[1] "Hmisc is loaded correctly"
Loading required package: reshape2
[1] "reshape2 is loaded correctly"
Loading required package: dplyr
Attaching package: ‘dplyr’
The following objects are masked from ‘package:Hmisc’:
src, summarize
The following object is masked from ‘package:gridExtra’:
combine
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
[1] "dplyr is loaded correctly"
Loading required package: Nebulosa
[1] "Nebulosa is loaded correctly"
Loading required package: monocle3
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:dplyr’:
combine, intersect, setdiff, union
The following object is masked from ‘package:gridExtra’:
combine
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames,
dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind,
Reduce, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit, which, which.max, which.min
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")',
and for packages 'citation("pkgname")'.
Attaching package: ‘Biobase’
The following object is masked from ‘package:Hmisc’:
contents
Loading required package: SingleCellExperiment
Loading required package: SummarizedExperiment
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:dplyr’:
first, rename
The following object is masked from ‘package:base’:
expand.grid
Loading required package: IRanges
Attaching package: ‘IRanges’
The following objects are masked from ‘package:dplyr’:
collapse, desc, slice
Loading required package: GenomeInfoDb
Loading required package: DelayedArray
Loading required package: matrixStats
Attaching package: ‘matrixStats’
The following objects are masked from ‘package:Biobase’:
anyMissing, rowMedians
The following object is masked from ‘package:dplyr’:
count
Attaching package: ‘DelayedArray’
The following objects are masked from ‘package:matrixStats’:
colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges
The following objects are masked from ‘package:base’:
aperm, apply, rowsum
Attaching package: ‘SummarizedExperiment’
The following object is masked from ‘package:Seurat’:
Assays
Attaching package: ‘monocle3’
The following objects are masked from ‘package:Biobase’:
exprs, fData, fData<-, pData, pData<-
[1] "monocle3 is loaded correctly"
Read in the Data
screen hits
## EDIT - change this to the excel table once we have it finalized for the screen
screen_hits <- c("PBANKA-0516300",
"PBANKA-1217700",
"PBANKA-0409100",
"PBANKA-1034300",
"PBANKA-1437500",
"PBANKA-0827500",
"PBANKA-0824300",
"PBANKA-1426900",
"PBANKA-0105300",
"PBANKA-0921100",
"PBANKA-1002400",
"PBANKA-0829400",
"PBANKA-1347200",
"PBANKA-0828000",
"PBANKA-0902300",
"PBANKA-1418100",
"PBANKA-1435200",
"PBANKA-1454800",
"PBANKA-0712300",
"PBANKA-0410500",
"PBANKA-1144800",
"PBANKA-1231600",
"PBANKA-0503200",
"PBANKA-0308900",
"PBANKA-1214700",
"PBANKA-0709900",
"PBANKA-0311900",
"PBANKA-0716500",
"PBANKA-1447900",
"PBANKA-0102200",
"PBANKA-0713500",
"PBANKA-0102400",
"PBANKA-1302700",
"PBANKA-1235900",
"PBANKA-0401100",
"PBANKA-0413400",
"PBANKA-1126900",
"PBANKA-1425900",
"PBANKA-0418300",
"PBANKA-1464600",
"PBANKA-0806000")
Read in gene annotations
gene_annotations <- read.table("../data/Reference/GenesByTaxon_Summary.csv", header = TRUE, sep = ",", stringsAsFactors = TRUE)
dim(gene_annotations)
## convert _ to -
gene_annotations$Gene.ID <- gsub("_", "-", gene_annotations$Gene.ID)
load in datasets
paste("10x dataset")
[1] "10x dataset"
pb_sex_filtered
An object of class Seurat
5098 features across 6191 samples within 1 assay
Active assay: RNA (5098 features, 2000 variable features)
2 dimensional reductions calculated: pca, umap
paste("Smart-seq2 dataset")
[1] "Smart-seq2 dataset"
ss2_mutants_final
An object of class Seurat
5245 features across 2717 samples within 1 assay
Active assay: RNA (5245 features, 2000 variable features)
2 dimensional reductions calculated: pca, umap
paste("The composition of the Smart-seq2 dataset is:")
[1] "The composition of the Smart-seq2 dataset is:"
table(ss2_mutants_final@meta.data$genotype)
Mutant WT
2028 689
4. Dimensionality reduction
PCA

Optimised UMAP
After optimisation, the following UMAP can be calculated:
## Run optimised UMAP
tenx.justwt.integrated <- RunUMAP(tenx.justwt.integrated, reduction = "pca", dims = 1:10, n.neighbors = 50, seed.use = 1234, min.dist = 0.4, repulsion.strength = 0.03, local.connectivity = 150)

Make final plots:
## split seurat object up
ob.list <- SplitObject(tenx.justwt.integrated, split.by = "experiment")
## make plots for each object
plot.list <- lapply(X = ob.list, FUN = function(x) {
DimPlot(x, dims = c(1,2), pt.size = 1) + theme(legend.position="bottom")
})
composition_umap_10x <- plot.list$`tenx_5k` +
coord_fixed() +
theme_void() +
scale_color_manual(values=c(replicate(45, "#999999"))) +
labs(title = paste("10x (wild-type)")) +
theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
composition_umap_ss2 <- plot.list$mutants +
coord_fixed() +
theme_void() +
scale_color_manual(values=c(replicate(46, "#999999"))) +
labs(title = paste("Smart-seq2 (wild-type)")) +
theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
composition_umap <- composition_umap_10x + composition_umap_ss2
composition_umap

save
ggsave("../images_to_export/composition_umap.png", plot = composition_umap, device = "png", path = NULL, scale = 1, width = 20, height = 20, units = "cm", dpi = 300, limitsize = TRUE)
## make plots
## hoo dataset correlation
UMAP_hoo <- DimPlot(tenx.justwt.integrated, pt.size = 0.1, group.by = "Prediction.Spearman.", dims = c(1,2)) +
coord_fixed() +
theme_void() +
labs(title = paste("Hoo Predicted Timepoint")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_manual(values = inferno(12)) +
labs(colour = "hour") +
theme(legend.position = "bottom", legend.title=element_text(size=10))
## ap2g timecourse in this paper correlation
UMAP_kasia <- DimPlot(tenx.justwt.integrated, pt.size = 0.1, group.by = "Prediction.Spearman._Kasia", dims = c(1,2)) +
coord_fixed() +
theme_void() +
labs(title = paste("AP2G Timecourse Predicted Timepoint")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_manual(values = inferno(10)) +
labs(colour = "hour") +
theme(legend.position = "bottom", legend.title=element_text(size=10))
## combine
umap_bulk <- wrap_plots(UMAP_hoo, UMAP_kasia, ncol = 2)
## print
umap_bulk

MCA Mapping
Build the index
## load in mca data
counts <- read.csv("../data/Reference/scmap/allpb10x_counts.csv", row.names = 1)
pheno <- read.csv("../data/Reference/scmap/allpb10x_pheno.csv")
## change dash
rownames(counts) <- gsub("_", "-", rownames(counts))
## load MCA pal
mca_colors <- c("6"="#78C679",
"2"="#D1EC9F",
"0"="#FEB24C",
"1"="#F4CF63",
"3"="#FEEEAA",
"4"="#85B1D3",
"7"="#9ecae1",
"5"="#C9E8F1",
"M"= "#B7B7D8",
"F"="#9C96C6",
"unassigned"="darkgrey")
## plot
ggplot(pheno, aes(x=PC2_3d, y = PC3_3d, colour=absclust3)) +
geom_point() +
theme_classic() +
scale_color_manual(values = mca_colors, labels = (c("0 - early troph", "1 - mid troph","2 - late ring", "3 - late troph","4 - early schizont", "5 - late schizont", "6 - early ring","7 - mid schizont", "female", "male", "unassigned")))

###Making an ortholog reference index
## load required libraries
library(scmap)
Creating a generic function for ‘toJSON’ from package ‘jsonlite’ in package ‘googleVis’
library(SingleCellExperiment)
Loading required package: SummarizedExperiment
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply,
parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call,
duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map,
mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind,
Reduce, rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit, which,
which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:base’:
expand.grid
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor,
see 'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: DelayedArray
Loading required package: matrixStats
Attaching package: ‘matrixStats’
The following objects are masked from ‘package:Biobase’:
anyMissing, rowMedians
Attaching package: ‘DelayedArray’
The following objects are masked from ‘package:matrixStats’:
colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges
The following objects are masked from ‘package:base’:
aperm, apply, rowsum
Attaching package: ‘SummarizedExperiment’
The following object is masked from ‘package:Seurat’:
Assays
#prep the SCE, if was originally a Suerat object need the dfs to be regular matrices
#pb_filtered_sce_orth <- pb_filtered_sce_orth[, colData(pb_filtered_sce_orth)$absclust3 != "8"]
#sce <- pb_filtered_sce_orth
#pca <- plotPCA(sce)
#pcs <- pca$data
#table(rownames(pcs)==colnames(sce))
#colData(sce) <- cbind(colData(sce), pcs)
#rowData(sce)$feature_symbol <- rowData(sce)$gene
sce <- SingleCellExperiment(list(counts=counts),
colData=DataFrame(label=pheno),
rowData=DataFrame(feature_symbol=rownames(counts)))
sce
class: SingleCellExperiment
dim: 4890 4884
metadata(0):
assays(1): counts
rownames(4890): PBANKA-0000301 PBANKA-0000600 ... PBANKA-MIT0350 PBANKA-MIT0360
rowData names(1): feature_symbol
colnames(4884): AAACCTGAGCACCGTC AAACCTGAGCGCTTAT ... AGCTTGACAATAGAGT CCGGGATCACGTTGGC
colData names(21): label.X label.sample_id ... label.PC2_3d label.PC3_3d
reducedDimNames(0):
altExpNames(0):
counts_1 <- assay(sce, "counts")
libsizes <- colSums(counts_1)
size.factors <- libsizes/mean(libsizes)
logcounts(sce) <- log2(t(t(counts_1)/size.factors) + 1)
#counts(sce) <- as.matrix(counts(sce))
#logcounts(sce) <- as.matrix(logcounts(sce))
# remove features with duplicated names
sce <- sce[!duplicated(rownames(sce)), ]
#build scmap-cell reference index, save this rds
sce <- selectFeatures(sce, suppress_plot = FALSE, n_features = 500)

table(rowData(sce)$scmap_features)
FALSE TRUE
4390 500
set.seed(1)
sce <- indexCell(sce)
Parameter M was not provided, will use M = n_features / 10 (if n_features <= 1000), where n_features is the number of selected features, and M = 100 otherwise.
Parameter k was not provided, will use k = sqrt(number_of_cells)
names(metadata(sce)$scmap_cell_index)
[1] "subcentroids" "subclusters"
length(metadata(sce)$scmap_cell_index$subcentroids)
[1] 50
dim(metadata(sce)$scmap_cell_index$subcentroids[[1]])
[1] 10 69
metadata(sce)$scmap_cell_index$subcentroids[[1]][,1:5]
1 2 3 4 5
PBANKA-0100021 0.000000000 0.002174527 0.00000000 0.000000000 0.001882238
PBANKA-0100400 0.674777171 0.000000000 0.71839411 0.000000000 0.000000000
PBANKA-0100700 0.000000000 0.000000000 0.06346152 0.000000000 0.003406199
PBANKA-0100900 0.026157737 0.000000000 0.03578261 0.000000000 0.439794449
PBANKA-0102700 0.000000000 0.000000000 0.00000000 0.007797631 0.010978344
PBANKA-0103200 0.000000000 0.000000000 0.00000000 0.000000000 0.000000000
PBANKA-0105100 0.000000000 0.000000000 0.00000000 0.000000000 0.000000000
PBANKA-0107300 0.007199727 0.999997636 0.65054584 0.999969598 0.898011255
PBANKA-0108800 0.734420349 0.000000000 0.23174764 0.000000000 0.004578116
PBANKA-0109900 0.067575567 0.000000000 0.04105225 0.000000000 0.000000000
dim(metadata(sce)$scmap_cell_index$subclusters)
[1] 50 4884
#saveRDS(pb_filtered_sce_orth, file="pb_filtered_sce_orthindex_20181109.rds")
Map to the index
## convert seurat object above into sce for this
pf_field_counts <- as.matrix(tenx.justwt.integrated@assays$RNA@data)
pf_field_pheno <- tenx.justwt.integrated@meta.data
#prep the SCE, if was originally a Suerat object need the dfs to be regular matrices
#pb_filtered_sce_orth <- pb_filtered_sce_orth[, colData(pb_filtered_sce_orth)$absclust3 != "8"]
#sce <- pb_filtered_sce_orth
#pca <- plotPCA(sce)
#pcs <- pca$data
#table(rownames(pcs)==colnames(sce))
#colData(sce) <- cbind(colData(sce), pcs)
#rowData(sce)$feature_symbol <- rowData(sce)$gene
pf_live_field.sce.orth <- SingleCellExperiment(list(counts=pf_field_counts),
colData=DataFrame(label=pf_field_pheno),
rowData=DataFrame(feature_symbol=rownames(pf_field_counts)))
pf_live_field.sce.orth
class: SingleCellExperiment
dim: 5098 6880
metadata(0):
assays(1): counts
rownames(5098): PBANKA-0000101 PBANKA-0000301 ... PBANKA-MIT0360 PBANKA-MIT0370
rowData names(1): feature_symbol
colnames(6880): AAACCTGGTAAGGGCT AAACCTGGTGCACTTA ... SC25027_8_95 SC25027_8_96
colData names(126): label.orig.ident label.nCount_RNA ... label.pt_id_cols
label.seurat_clusters_dot_plotting
reducedDimNames(0):
altExpNames(0):
counts_1 <- assay(pf_live_field.sce.orth, "counts")
libsizes <- colSums(counts_1)
size.factors <- libsizes/mean(libsizes)
logcounts(pf_live_field.sce.orth) <- log2(t(t(counts_1)/size.factors) + 1)
#Project query data set onto cell index
scmapCell_results <- scmapCell(
pf_live_field.sce.orth,
list(mca = metadata(sce)$scmap_cell_index
)
)
##Look into the results
# For each dataset there are two matricies. cells matrix contains the top 10 (scmap default) cell IDs of the cells of the reference dataset that a given cell of the projection dataset is closest to:
#
# Give assignments in two ways:
# 1. Take the top cell assignment abs clust, if cosine similarity is less than 0.4 (or adjust if needed) mark as unassigned
# 2. For the top 3 nearest neighbors, get a mean of the PCA coordinates and snap to the nearest cell of those coordinates. If any of the top three cells are sim below 0.4 then mark as unassigned.
##Top cell assignment method
scmapCell_results$mca$cells[, 1:3]
AAACCTGGTAAGGGCT AAACCTGGTGCACTTA AAACGGGTCGTCCAGG
[1,] 1591 2340 2220
[2,] 2310 2185 2122
[3,] 4699 2242 2350
[4,] 2137 1591 2242
[5,] 2121 2430 1653
[6,] 2092 2108 2457
[7,] 1091 1094 2175
[8,] 4680 2090 2170
[9,] 2226 4122 2365
[10,] 2136 2226 2226
getcells <- scmapCell_results$mca$cells[1, ]
cdsce <- colData(sce)[getcells, ]
topsim <- scmapCell_results$mca$similarities[1, ]
pf_live_field.sce.orth$topcell <- cdsce$label.sample_id
pf_live_field.sce.orth$topcell_ac <- cdsce$label.absclust3
pf_live_field.sce.orth$indexPC1 <- cdsce$label.PC1
pf_live_field.sce.orth$indexPC2 <- cdsce$label.PC2
#pf_live_field.sce.orth$pbpt <- cdsce$pseudotime
#pf_live_field.sce.orth$pbbulk <- cdsce$bulk
pf_live_field.sce.orth$topcell_sp <- pf_live_field.sce.orth$topcell_ac
pf_live_field.sce.orth$topsim <- topsim
pf_live_field.sce.orth$topcell_sp[pf_live_field.sce.orth$topsim < 0.3] <- "unassigned"
table(pf_live_field.sce.orth$topcell_sp)
0 1 2 3 4 5 6 7 F M
1766 534 3478 123 73 71 103 37 353 342
add these metrics back into the object
## add the meta data from scmap to the object
# live
## extract
df_plot_live <- data.frame(topcell_sp = colData(pf_live_field.sce.orth)$topcell_sp, topsim = colData(pf_live_field.sce.orth)$topsim, row.names = rownames(colData(pf_live_field.sce.orth)))
## add to object
tenx.justwt.integrated <- AddMetaData(
object = tenx.justwt.integrated,
metadata = df_plot_live,
col.name = c('mca_topcell_sp', 'mca_topsim')
)
## need to fix the NA in topcell_sp
# Un-factorize (as.numeric can be use for numeric values)
# (as.vector can be use for objects - not tested)
tenx.justwt.integrated@meta.data$mca_topcell_sp <- as.character(tenx.justwt.integrated@meta.data$mca_topcell_sp)
## add unassigned
tenx.justwt.integrated@meta.data$mca_topcell_sp[is.na(tenx.justwt.integrated@meta.data$mca_topcell_sp)] <- "unassigned"
# Re-factorize with the as.factor function or simple factor(fixed$Type)
tenx.justwt.integrated@meta.data$mca_topcell_sp <- as.factor(tenx.justwt.integrated@meta.data$mca_topcell_sp)
## check
table(tenx.justwt.integrated@meta.data$mca_topcell_sp)
0 1 2 3 4 5 6 7 F M
1766 534 3478 123 73 71 103 37 353 342
## reorder levels
tenx.justwt.integrated@meta.data$mca_topcell_sp <- factor(tenx.justwt.integrated@meta.data$mca_topcell_sp, levels = rev(c("M", "F", "5", "7", "4", "3", "1", "0", "2", "6")))
UMAP_mca <- DimPlot(tenx.justwt.integrated, pt.size = 0.1, group.by = "mca_topcell_sp", dims = c(1,2)) +
coord_fixed() +
theme_void() +
labs(title = paste("MCA Predicted Cell Type"), colour = "MCA Cell Type") +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_manual(values = mca_colors, labels = (c("6 - early ring", "2 - late ring", "0 - early troph", "1 - mid troph", "3 - late troph", "4 - early schizont", "7 - mid schizont", "5 - late schizont", "female", "male"))) +
theme(legend.position = "right", legend.title=element_text(size=15))
UMAP_mca

save
ggsave("../images_to_export/UMAP_mca.png", plot = UMAP_mca, device = "png", path = NULL, scale = 1, width = 15, height = 15, units = "cm", dpi = 300, limitsize = TRUE)
8. Plots
make composite pseudotime/ID figure
# 1 = blue - "#0052c5"
# 2 = red - "#a52b1e"
# 3 = green - "#016c00"
# 4 = yellow - "#ffe400"
#pal_sex <- c("#0052c5","#ffe400", "#a52b1e", "#016c00")
## extract pseudotime numbers and identity of cells to a dataframe
df_pt_id <- tenx.justwt.integrated@meta.data[,c("old_pt_values", "monocle_sex")]
## inspect possible values
list_of_sexes <- names(table(df_pt_id$monocle_sex))
## make a new column
df_pt_id$colour <- NA
## make colour ramps
asex_ramp <- colorRampPalette(c("#D5E3F5", "#0052c5"))
male_ramp <- colorRampPalette(c("white", "yellow", "#016c00"))
female_ramp <- colorRampPalette(c("yellow", "#a52b1e"))
bipot_ramp <- colorRampPalette(c("white", "#ffe400"))
## re-classify the cells that are unassigned cells removed from sexual branch above:
#df_pt_id[which(rownames(df_pt_id) %in% remove_cells), ]$monocle_sex <- "Asexual"
## assign values to each cluster
## help here: https://stackoverflow.com/questions/9946630/colour-points-in-a-plot-differently-depending-on-a-vector-of-values
df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$colour <- asex_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$old_pt_values,breaks = 100))]
df_pt_id[df_pt_id$monocle_sex == "Male", ]$colour <- male_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Male", ]$old_pt_values,breaks = 100))]
df_pt_id[df_pt_id$monocle_sex == "Female", ]$colour <- female_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Female", ]$old_pt_values,breaks = 100))]
df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$colour <- bipot_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$old_pt_values,breaks = 100))]
## check everything has a value
#table(is.na(df_pt_id$colour))
## make into a df
#df_pt_id <- df_pt_id[ ,"colour", drop = FALSE]
## add back to seurat object
df <- df_pt_id[ ,"colour", drop = FALSE]
tenx.justwt.integrated <- AddMetaData(tenx.justwt.integrated, df, "pt_id_cols")
rm(df)
## plot
## extract UMAP coords
df_umap_plot <- tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings
df_umap_plot <- merge(df_umap_plot, df_pt_id, by=0, all=TRUE)
## add tree
##The tree for monocle is located here:
# monocle.object@principal_graph_aux[["UMAP"]]$dp_mst
ica_space_df <- t(monocle.object.all@principal_graph_aux[["UMAP"]]$dp_mst) %>%
as.data.frame() %>%
dplyr::select_(prin_graph_dim_1 = "UMAP_1", prin_graph_dim_2 = "UMAP_2") %>%
dplyr::mutate(sample_name = rownames(.),
sample_state = rownames(.))
dp_mst <- monocle.object.all@principal_graph[["UMAP"]]
edge_df <- dp_mst %>%
igraph::as_data_frame() %>%
dplyr::select_(source = "from", target = "to") %>%
dplyr::left_join(ica_space_df %>%
dplyr::select_(
source="sample_name",
source_prin_graph_dim_1="prin_graph_dim_1",
source_prin_graph_dim_2="prin_graph_dim_2"),
by = "source") %>%
dplyr::left_join(ica_space_df %>%
dplyr::select_(
target="sample_name",
target_prin_graph_dim_1="prin_graph_dim_1",
target_prin_graph_dim_2="prin_graph_dim_2"),
by = "target")
## make ggplot
umap_id_pt <- ggplot(df_umap_plot, aes(x = UMAP_1, y = UMAP_2)) +
geom_point(col = df_umap_plot$colour) +
theme_void() +
coord_fixed() +
geom_segment(aes_string(x="source_prin_graph_dim_1",
y="source_prin_graph_dim_2",
xend="target_prin_graph_dim_1",
yend="target_prin_graph_dim_2"),
data=edge_df)
umap_id_pt
save
#ggsave("../images_to_export/umap_id_pt.png", plot = umap_id_pt, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
cluster plot
### Prepare data
## extract df with cluster ID,
df_pt_id <- tenx.justwt.integrated@meta.data[ ,"seurat_clusters", drop=FALSE]
## extract UMAP coords
df_plot <- tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings
df_plot <- merge(df_plot, df_pt_id, by=0, all=TRUE)
## add colour col
df_plot$colour <- df_plot$seurat_clusters
levels(df_plot$colour) <- list(print(asex_ramp(17)[1]) = "9", asex_ramp(17)[2]="4",
asex_ramp(17)[3]="15",
asex_ramp(17)[4] = "8",
asex_ramp(17)[5]= "1",
asex_ramp(17)[6]= "14",
asex_ramp(17)[7] = "2",
asex_ramp(17)[8] = "10",
asex_ramp(17)[9] = "3",
asex_ramp(17)[10] = "0",
asex_ramp(17)[11] = "6",
asex_ramp(17)[12] = "5",
asex_ramp(17)[13] = "7",
asex_ramp(17)[14] = "12",
asex_ramp(17)[15] = "18",
asex_ramp(17)[16] = "20",
asex_ramp(17)[17] = "23",
bipot_ramp(3)[2] = "11",
male_ramp(3)[2] = "16",
male_ramp(3)[3] = "13",
female_ramp(4)[2] = "21",
female_ramp(4)[3] = "22",
female_ramp(4)[4] = "19",
female_ramp(4)[4] = "17"
)
## aabbreviate clusters
df_plot$cluster_names <- df_plot$seurat_clusters
levels(df_plot$cluster_names) <- list(A_1="9",
A_2="4",
A_3="15",
A_4 = "8",
A_5 = "1",
A_6= "14",
A_7 = "2",
A_8 = "10",
A_9 = "3",
A_10 = "0",
A_11 = "6",
A_12 = "5",
A_13 = "7",
A_14 = "12",
A_15 = "18",
A_16 = "20",
A_17 = "23",
B = "11",
M_1 = "16",
M_2 = "13",
F_1 = "21",
F_2 = "22",
F_3 = "19",
F_3 = "17"
)
library(dplyr)
dat%>%
group_by(custid)%>%
summarise(Mean=mean(value), Max=max(value), Min=min(value), Median=median(value), Std=sd(value))
## plot
umap_id <- ggplot(df_plot, aes(x = UMAP_1, y = UMAP_2)) +
geom_point(col = df_plot$colour) +
theme_void() +
coord_fixed()
umap_id
ggsave("../images_to_export/umap_with_clusters.png", plot = umap_with_clusters, device = "png", path = NULL, scale = 1, width = 21, height = 29.5, units = "cm", dpi = 300, limitsize = TRUE)
save
ggsave("../images_to_export/umap_with_clusters.png", plot = umap_with_clusters, device = "png", path = NULL, scale = 1, width = 21, height = 29.5, units = "cm", dpi = 300, limitsize = TRUE)
## extract proportion of cells
library(plyr)
df_prop_comb <- as.data.frame(table(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting, tenx.justwt.integrated@meta.data$experiment))
names(df_prop_comb) <- c("cluster", "experiment", "Freq")
## calcualte the percentage
df_prop_comb$pc <- NA
df_prop_comb[df_prop_comb$experiment == "mutants", ]$pc <- (df_prop_comb[df_prop_comb$experiment == "mutants", ]$Freq/sum(df_prop_comb[df_prop_comb$experiment == "mutants", ]$Freq))*100
df_prop_comb[df_prop_comb$experiment == "tenx_5k", ]$pc <- (df_prop_comb[df_prop_comb$experiment == "tenx_5k", ]$Freq/sum(df_prop_comb[df_prop_comb$experiment == "tenx_5k", ]$Freq))*100
## reorder levels in Var1
#df_prop_comb$cluster <- factor(df_prop_comb$cluster, levels = c("unassigned", "M", "F", "5", "7", "4", "3", "1", "0", "2", "6"))
## then reorder by this so the cumsum will work below
df_prop_comb <- df_prop_comb[rev(order(df_prop_comb$experiment, df_prop_comb$cluster)),]
## Calculate the cumulative sum of len for each dose
df_cumsum <- ddply(df_prop_comb, "experiment", transform, label_ypos=cumsum(pc) - 0.5*pc)
head(df_cumsum)
# http://www.sthda.com/english/wiki/ggplot2-barplots-quick-start-guide-r-software-and-data-visualization
## reverse the levels in Var1
#df_cumsum$cluster <- factor(df_cumsum$cluster, levels = rev(c(levels(df_cumsum$cluster))))
#df_cumsum <- df_cumsum[c(match(df_cumsum$cluster[1:23], c(levels(df_cumsum$cluster))), match(df_cumsum$cluster[1:23], c(levels(df_cumsum$cluster))) + 23), ]
library(ggrepel)
library(ggpubr)
## make plot
plot_prop <- ggplot(data=df_cumsum, aes(x=experiment, y=pc, fill=cluster)) +
geom_bar(stat="identity")+
geom_label_repel(aes(label = Freq, y=label_ypos, fill = factor(cluster)), color = c(rep(c(rep(c("#ffffff"), 2), "#000000", "#ffffff", rep(c("#000000"), 2), rep(c("#ffffff"), 5), rep(c("#000000"), 12)), 2)), size = 3.5, show.legend = FALSE) +
#geom_text(aes(y=label_ypos, label=Freq), vjust=0, color="black", size=3.5) +
labs(fill = "Cell cluster", y= "Cell Proportions (%)", x = "Technology") +
scale_fill_manual(labels = c("Asexual 1", "Asexual 2" , "Asexual 3", "Asexual 4", "Asexual 5", "Asexual 6", "Asexual 7", "Asexual 8", "Asexual 9", "Asexual 10", "Asexual 11", "Asexual 12", "Asexual 13", "Asexual 14", "Asexual 15", "Asexual 16", "Asexual 17", "Progenitor", "Male 1", "Male 2", "Female 1", "Female 2", "Female 3"), values = pal_plot) +
theme_pubr() +
theme(legend.text=element_text(size=10), legend.position="bottom") +
scale_x_discrete(labels= rev(c("10x", "Smart-seq2"))) +
coord_flip() +
guides(fill=guide_legend(ncol=9))
plot_prop
ggsave("../images_to_export/WT_cell_type_proportions.png", plot = plot_prop, device = "png", path = NULL, scale = 1, width = 30, height = 9, units = "cm", dpi = 300, limitsize = TRUE)
dotplot
A dotplot allows us to look at the expression of multiple genes in a clearer way than succesive UMAP plots
### Data set-up
# PBANKA-0828000 GCSKO-3 GD1
# PBANKA-1302700 GCSKO-oom MD1
# PBANKA-1447900 GCSKO-29 MD2
# PBANKA-0413400 GCSKO-10_820 MD3
# PBANKA-0716500 GCSKO-19 MD4
# PBANKA-0102400 GCSKO-2 MD5
# PBANKA-1454800 GCSKO-21 FD1
# PBANKA-0902300 GCSKO-13 FD2
# PBANKA-1418100 GCSKO-17 FD3
# PBANKA-1435200 GCSKO-20 FD4
# PBANKA-1437500 - AP2G - commitment
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
marker_genes_list <- c("PBANKA-1437500", "PBANKA-1319500", "PBANKA-0416100", "PBANKA-0831000", "PBANKA-1102200")
mutant_genes_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0413400", "PBANKA-0716500", "PBANKA-0102400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")
## these get defined later on, but are replicated above here for plotting
asexual_early_clusters <- c(9, 4, 15, 8, 1, 14, 2, 10, 3, 0, 6, 5)
asexual_late_clusters <- c(7, 12, 18, 20, 23)
bipotentional_early_clusters <- # 0?
bipotential_clusters <- c(11)
male_clusters <- c(16, 13)
female_clusters <- c(21, 22, 17, 19)
## copy the clusters so you don't permanently edit the master
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- tenx.justwt.integrated@meta.data$seurat_clusters
## reorder the levels so you can plot the cluters as you wish
my_levels <- c(asexual_early_clusters,asexual_late_clusters, bipotential_clusters, male_clusters, female_clusters)
## reorder the levels
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- factor(x = tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting, levels = my_levels)
## rename clusters so that they are intuitive
# this trick is from here: http://www.cookbook-r.com/Manipulating_data/Renaming_levels_of_a_factor/
levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting) <- list(Asexual_1="9",
Asexual_2="4",
Asexual_3="15",
Asexual_4 = "8",
Asexual_5 = "1",
Asexual_6= "14",
Asexual_7 = "2",
Asexual_8 = "10",
Asexual_9 = "3",
Asexual_10 = "0",
Asexual_11 = "6",
Asexual_12 = "5",
Asexual_13 = "7",
Asexual_14 = "12",
Asexual_15 = "18",
Asexual_16 = "20",
Asexual_17 = "23",
Progenitor = "11",
Male_1 = "16",
Male_2 = "13",
Female_1 = "21",
Female_2 = "22",
Female_3 = "19",
Female_3 = "17"
)
### Annotation set-up
## extract pseudotime numbers and identity of cells to a dataframe
df_pt_id <- tenx.justwt.integrated@meta.data[,c("old_pt_values", "monocle_sex", "seurat_clusters_dot_plotting", "Prediction.Spearman.", "Prediction.Spearman._Kasia")]
## make a new column
df_pt_id$colour <- NA
## assign bins to each of the values
## help here: https://stackoverflow.com/questions/9946630/colour-points-in-a-plot-differently-depending-on-a-vector-of-values
df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$old_pt_values,breaks = 100))
df_pt_id[df_pt_id$monocle_sex == "Male", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Male", ]$old_pt_values,breaks = 100))
df_pt_id[df_pt_id$monocle_sex == "Female", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Female", ]$old_pt_values,breaks = 100))
df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$old_pt_values,breaks = 100))
## make colour ramps
asex_ramp <- colorRampPalette(c("#D5E3F5", "#0052c5"))
male_ramp <- colorRampPalette(c("white", "yellow", "#016c00"))
female_ramp <- colorRampPalette(c("yellow", "#a52b1e"))
bipot_ramp <- colorRampPalette(c("white", "#ffe400"))
## assign values to each cluster
## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "colour"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
## BECAUSE we have ordered the clusters already, we can simply take the row index for this bit
df_annotation$colour <- NA
df_annotation[1:17, ]$colour <- asex_ramp(100)[df_annotation[1:17, ]$x]
df_annotation[18, ]$colour <- bipot_ramp(100)[df_annotation[18, ]$x]
df_annotation[19:20, ]$colour <- male_ramp(100)[df_annotation[19:20, ]$x]
df_annotation[21:23, ]$colour <- female_ramp(100)[df_annotation[21:23, ]$x]
## plot annotation
## this really helped: https://www.biostars.org/p/396810/
h2 <- ggplot(df_annotation)+
geom_bar(mapping = aes(x = Group.1, y = 1),
stat = "identity",
fill = df_annotation$colour,
col = "#FFFFFF",
width = 1)+
theme_void()+
theme(panel.spacing.x = unit(1, "mm"))#+
#facet_grid(.~colour, scales = "free_x")
#legend <- plot_grid(get_legend(h2), get_legend(h1), ncol = 1)
h2 <- h2 + theme(legend.position = "none")
# geom_point(col = df_umap_plot$colour)
h1 <- ggplot(df_annotation)+
geom_point(mapping = aes(x = Group.1, y = 1),
col = df_annotation$colour,
size = 5)+
theme_void()+
theme(panel.spacing.x = unit(1, "mm"))#+
#facet_grid(.~colour, scales = "free_x")
#legend <- plot_grid(get_legend(h2), get_legend(h1), ncol = 1)
h1 <- h1 + theme(legend.position = "none")
# geom_point(col = df_umap_plot$colour)
## add predicted time point
## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "Prediction.Spearman."], list(df_pt_id$seurat_clusters_dot_plotting), mean)
df_annotation$colour <- NA
df_annotation$colour <- viridis(300)[(df_annotation$x)*10]
h3 <- ggplot(df_annotation)+
geom_bar(mapping = aes(x = Group.1, y = 1),
stat = "identity",
fill = df_annotation$colour,
col = "#FFFFFF",
width = 1)+
theme_void()+
theme(panel.spacing.x = unit(1, "mm"))#+
#facet_grid(.~colour, scales = "free_x")
legend_h3 <- get_legend(h3)
h3 <- h3 + theme(legend.position = "none")
h3 <- ggplot(data = df_annotation,
mapping = aes(x = Group.1, fill=x)
) +
geom_bar(col = "#FFFFFF", width = 1)+
theme_void() +
theme(panel.spacing.x = unit(1, "mm")) +
scale_fill_viridis(option = 'cividis')
legend_h3 <- get_legend(h3)
h3 <- h3 + theme(legend.position = "none")
## add kasia data
## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "Prediction.Spearman._Kasia"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
df_annotation$colour <- NA
df_annotation$colour <- viridis(300)[(df_annotation$x)*10]
h4 <- ggplot(df_annotation)+
geom_bar(mapping = aes(x = Group.1, y = 1),
stat = "identity",
fill = df_annotation$colour,
col = "#FFFFFF",
width = 1)+
theme_void()+
theme(panel.spacing.x = unit(1, "mm"))+
facet_grid(.~colour, scales = "free_x")
legend <- plot_grid(get_legend(h4), ncol = 1)
legend_h4 <- get_legend(h4)
h4 <- h4 + theme(legend.position = "none")
h4 <- ggplot(data = df_annotation,
mapping = aes(x = Group.1, fill=x)
) +
geom_bar(col = "#FFFFFF", width = 1)+
theme_void() +
theme(panel.spacing.x = unit(1, "mm")) +
scale_fill_viridis()
legend_h4 <- get_legend(h4)
h4 <- h4 + theme(legend.position = "none")
### Plot
dot_plot_markers <- DotPlot(tenx.justwt.integrated,
features = c(marker_genes_list, rev(mutant_genes_list)),
group.by = "seurat_clusters_dot_plotting",
dot.min = 0.00001,
assay = 'RNA') +
theme_classic() +
coord_fixed() +
coord_flip() +
# change appearance and remove axis elements, and make room for arrows
theme(axis.text.x = element_text(size=16, angle = 45, hjust=1,vjust=1, family = "Arial"),
axis.text.y = element_text(size=16, face="italic"),
text=element_text(size=16, family="Arial", colour="black"),
legend.position = "bottom",
legend.direction = "horizontal",
legend.box = "vertical",
plot.title = element_blank(),
plot.margin = unit(c(1,3,1,3), "lines"),
axis.line = element_blank(),
panel.border = element_rect(colour = "black", fill=NA, size=0.5)) +
#change the colours
#scale_colour_viridis(option = "inferno", guide = "colourbar", na.value="white", begin = 0, end = 1, direction = 1) +
scale_color_continuous_sequential(palette = "Purples 2") +
## change x axis label
# labs(x = "Marker Genes", y = "Cluster", title = "Expression of Marker Genes by Cluster")
labs(x = "", y = "", title = "") +
## add arrows
#annotate("segment", x = 5.5, xend = 5.5, y = 21.5, yend = 25, colour = "green", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
#annotate("segment", x = 5.5, xend = 5.5, y = 16.5, yend = 21.5, colour = "red", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
#annotate("segment", x = 5.5, xend = 5.5, y = 0, yend = 15.5, colour = "grey", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
## annotate asex
geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
## annotate bipotential
geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters, bipotential_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
## annotate sexes
geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters, bipotential_clusters, male_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
## change label on bottom of plot so we can indicate markers
scale_x_discrete(labels = rev(c("gd1", "md1", "md2", "md3", "md4", "md5", "fd1", "fd2", "fd3", "fd4", "msp8", "msp1", "mg1", "ccp2", "ap2g"))) +
## change label on bottom of plot so we can indicate markers
scale_y_discrete(labels = c("Asexual 1", "Asexual 2" , "Asexual 3", "Asexual 4", "Asexual 5", "Asexual 6", "Asexual 7", "Asexual 8", "Asexual 9", "Asexual 10", "Asexual 11", "Asexual 12", "Asexual 13", "Asexual 14", "Asexual 15", "Asexual 16", "Asexual 17", "Progenitor", "Male 1", "Male 2", "Female 1", "Female 2", "Female 3")) +
## change name of legends
guides(col=guide_colorbar(title = 'Scaled Average Expression'),
size=guide_legend("% of cells expressing"))
## view
#print(dot_plot_markers)
plot <- plot_grid(h4, h3, h1, dot_plot_markers, align = "v", ncol = 1, axis = "tb", rel_heights = c(0.5, 0.5, 0.5, 18))
dot_plot <- plot_grid(plot, legend_h3, legend_h4, nrow = 1, rel_widths = c(10, 1.5, 1.5))
dot_plot
save
ggsave("../images_to_export/dot_plot_all.png", plot = dot_plot, device = "png", path = NULL, scale = 1, width = 29.7, height = 21, units = "cm", dpi = 300, limitsize = TRUE)
Make individual plots highlighting where cells in each cluster fall
plot
## this function writes the next bit of code for you
## put it into the console and paste the response
#ploty <- c()
#for(i in seq_along(levels(tenx.justwt.integrated@meta.data$seurat_clusters))){
# ploty <- paste0(ploty, "list_UMAPs_by_cluster[[", i, "]]", " + ")
#}
## plot
composite_plot <- plot_grid(list_UMAPs_by_cluster[[1]], list_UMAPs_by_cluster[[2]], list_UMAPs_by_cluster[[3]], list_UMAPs_by_cluster[[4]], list_UMAPs_by_cluster[[5]], list_UMAPs_by_cluster[[6]], list_UMAPs_by_cluster[[7]], list_UMAPs_by_cluster[[8]], list_UMAPs_by_cluster[[9]], list_UMAPs_by_cluster[[10]], list_UMAPs_by_cluster[[11]], list_UMAPs_by_cluster[[12]], list_UMAPs_by_cluster[[13]], list_UMAPs_by_cluster[[14]], list_UMAPs_by_cluster[[15]], list_UMAPs_by_cluster[[16]], list_UMAPs_by_cluster[[17]], list_UMAPs_by_cluster[[18]], list_UMAPs_by_cluster[[19]], list_UMAPs_by_cluster[[20]], list_UMAPs_by_cluster[[21]], list_UMAPs_by_cluster[[22]], list_UMAPs_by_cluster[[23]], ncol = 8)
composite_plot
save
ggsave("../images_to_export/clusters_composite_plot.png", plot = composite_plot, device = "png", path = NULL, scale = 1, width = 21, height = 10, units = "cm", dpi = 300, limitsize = TRUE)
Expression plots
Expression - raw - Viridis
## find a good ring marker, to see if there is a better one than the ones reported
#markers_ring <- FindMarkers(tenx.justwt.integrated, ident.1 = c("4", "5", "16", "11", "7", "3", "9", "0", "22"))
#head(markers_ring)
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/
marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1319500", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste(italic(ccp2), "(Female)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
marker_gene_plot_MG1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0416100", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste(italic(mg1), "(Male)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
marker_gene_plot_AP2G <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1437500", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste(italic(ap2g), "(Commitment)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
marker_gene_plot_MSP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0831000", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste(italic(msp1), "(Schizont)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
marker_gene_plot_MSP8 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1102200", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste(italic(msp8), "(Asexual)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
marker_gene_plot_SBP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1101300", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste(italic(sbp1), "(Ring)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
marker_gene_plot_FAMB <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0722600", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste(italic(fam-b2), "(Ring)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
marker_gene_plot_HSP70 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0711900", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste(italic(HSP70), "(Reporter)","\n", "PBANKA_0711900")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))
## plot
## cowplot method
marker_gene_plot_all <- plot_grid(marker_gene_plot_FAMB, marker_gene_plot_MSP8, marker_gene_plot_MSP1, marker_gene_plot_AP2G, marker_gene_plot_CCP2, marker_gene_plot_MG1, marker_gene_plot_HSP70, nrow=3)
marker_gene_plot_all
## patchwork method
#marker_gene_plot_FAMB + marker_gene_plot_MSP8 + marker_gene_plot_MSP1 + marker_gene_plot_AP2G + marker_gene_plot_CCP2 + marker_gene_plot_MG1 + marker_gene_plot_HSP70
Expression - raw - purple
marker_gene_ramp <- colorRampPalette(c("#D3D3D3", "#1D1564"))(50)
marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1319500", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("CCP2 (Female)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=marker_gene_ramp)
marker_gene_plot_MG1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0416100", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("MG1 (Male)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=marker_gene_ramp)
marker_gene_plot_AP2G <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1437500", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("AP2G (Commitment)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=marker_gene_ramp)
marker_gene_plot_MSP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0831000", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("MSP1 (Schizont)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=marker_gene_ramp)
marker_gene_plot_MSP8 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1102200", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("MSP8 (Asexual)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=marker_gene_ramp)
marker_gene_plot_SBP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1101300", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("SBP1 (Ring)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=marker_gene_ramp)
marker_gene_plot_FAMB <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0722600", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("Fam-b2 (Ring)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=marker_gene_ramp)
marker_gene_plot_HSP70 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0711900", coord.fixed = TRUE,
#min.cutoff = "q1",
dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("(HSP70; Reporter)","\n", "PBANKA_0711900")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=marker_gene_ramp)
## plot
## cowplot method
marker_gene_plot_all <- plot_grid(marker_gene_plot_FAMB, marker_gene_plot_MSP8, marker_gene_plot_MSP1, marker_gene_plot_AP2G, marker_gene_plot_CCP2, marker_gene_plot_MG1, marker_gene_plot_HSP70, nrow=3)
marker_gene_plot_all
Expression - data - purple
# PBANKA-1418100 GCSKO-17 FD3
# PBANKA-0102400 GCSKO-2 MD3
# PBANKA-0716500 GCSKO-19 MD4
# PBANKA-1435200 GCSKO-20 FD4
# PBANKA-0902300 GCSKO-13 FD2
# PBANKA-0413400 GCSKO-10_820 MD5
# PBANKA-0828000 GCSKO-3 GD1
# PBANKA-1302700 GCSKO-oom MD1
# PBANKA-1447900 GCSKO-29 MD2
# PBANKA-1454800 GCSKO-21 FD1
# PBANKA-1144800 GCSKO-28 FD5
marker_gene_plot_17 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1418100", coord.fixed = TRUE, pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("fd3")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_continuous_sequential(palette = "Purples 2") +
## add sex symbols
annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") +
annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
marker_gene_plot_2 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0102400", coord.fixed = TRUE, pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("md3")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_continuous_sequential(palette = "Purples 2") +
## add sex symbols
annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") +
annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
marker_gene_plot_19 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0716500", coord.fixed = TRUE, pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("md4")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_continuous_sequential(palette = "Purples 2") +
## add sex symbols
annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") +
annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
marker_gene_plot_20 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1435200", coord.fixed = TRUE, pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("fd4")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_continuous_sequential(palette = "Purples 2") +
## add sex symbols
annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") +
annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
marker_gene_plot_13 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0902300", coord.fixed = TRUE, pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("fd2")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_continuous_sequential(palette = "Purples 2") +
## add sex symbols
annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") +
annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
marker_gene_plot_10 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0413400", coord.fixed = TRUE, pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("md5")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_continuous_sequential(palette = "Purples 2") +
## add sex symbols
annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") +
annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
marker_gene_plot_3 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0828000", coord.fixed = TRUE, pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("gd1")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_continuous_sequential(palette = "Purples 2") +
## add sex symbols
annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") +
annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
marker_gene_plot_oom <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1302700", coord.fixed = TRUE, pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("md1")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_continuous_sequential(palette = "Purples 2") +
## add sex symbols
annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") +
annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
marker_gene_plot_29 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1447900", coord.fixed = TRUE, pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("md2")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_continuous_sequential(palette = "Purples 2") +
## add sex symbols
annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") +
annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
marker_gene_plot_21 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1454800", coord.fixed = TRUE, pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("fd1")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_color_continuous_sequential(palette = "Purples 2") +
## add sex symbols
annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") +
annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))
## make composite plot
mutant_expression_composite <- wrap_plots(marker_gene_plot_17 , marker_gene_plot_2 , marker_gene_plot_19 , marker_gene_plot_20 , marker_gene_plot_13 , marker_gene_plot_10 , marker_gene_plot_3 , marker_gene_plot_oom , marker_gene_plot_29 , marker_gene_plot_21 , ncol = 4)
## print
mutant_expression_composite
Density Plots
Density - purple
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/
markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")
list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")
## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] +
coord_fixed() +
theme_void() +
labs(title = paste("gd1")) +
scale_colour_gradientn(colours=marker_gene_ramp) +
guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[2]] +
coord_fixed() + theme_void() +
labs(title = paste("md1")) +
scale_colour_gradientn(colours=marker_gene_ramp) +
guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() +
labs(title = paste("md4")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
ncol = 4)
UMAP_composite_mutant_genes
Density -viridis
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/
markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")
list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")
## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] +
coord_fixed() +
theme_void() +
labs(title = paste("gd1")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) +
guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[2]] +
coord_fixed() + theme_void() +
labs(title = paste("md1")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) +
guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)) )+ guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() +
labs(title = paste("md4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
ncol = 4)
UMAP_composite_mutant_genes
Density - viridis
# PBANKA-0828000 GCSKO-3 GD1
# PBANKA-1302700 GCSKO-oom MD1
# PBANKA-1447900 GCSKO-29 MD2
# PBANKA-0102400 GCSKO-2 MD3
# PBANKA-0716500 GCSKO-19 MD4
# PBANKA-0413400 GCSKO-10_820 MD5
# PBANKA-1454800 GCSKO-21 FD1
# PBANKA-0902300 GCSKO-13 FD2
# PBANKA-1418100 GCSKO-17 FD3
# PBANKA-1435200 GCSKO-20 FD4
markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")
list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")
## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] +
coord_fixed() +
theme_void() +
labs(title = paste("gd1")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) +
guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[2]] +
coord_fixed() + theme_void() +
labs(title = paste("md1")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) +
guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)) )+ guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() +
labs(title = paste("md4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
ncol = 4)
UMAP_composite_mutant_genes
## IF putting in MAIN - list_of_density_plots_mutant_genes[[2]] + coord_fixed() + theme_void() + theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 8), text=element_text(size=9)) + labs(title = paste("md1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.3, barheight = 4.0, title = ""))
# PBANKA-0828000 GCSKO-3 GD1
# PBANKA-1302700 GCSKO-oom MD1
# PBANKA-1447900 GCSKO-29 MD2
# PBANKA-0102400 GCSKO-2 MD3
# PBANKA-0716500 GCSKO-19 MD4
# PBANKA-0413400 GCSKO-10_820 MD5
# PBANKA-1454800 GCSKO-21 FD1
# PBANKA-0902300 GCSKO-13 FD2
# PBANKA-1418100 GCSKO-17 FD3
# PBANKA-1435200 GCSKO-20 FD4
markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")
list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "wkde", slot = 'data')
## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] +
coord_fixed() +
theme_void() +
labs(title = paste("gd1")) +
scale_color_continuous_sequential(palette = "Purples 2") +
guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[2]] +
coord_fixed() + theme_void() +
labs(title = paste("md1")) +
scale_color_continuous_sequential(palette = "Purples 2") +
guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[3]] +
coord_fixed() +
theme_void() +
labs(title = paste("md2")) + scale_color_continuous_sequential(palette = "Purples 2") +
guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() +
labs(title = paste("md4")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
ncol = 4)
## old colour scale: scale_colour_gradientn(colours=marker_gene_ramp )
UMAP_composite_mutant_genes
---
subtitle: 'Gametocyte Development in <i>Plasmodium berghei</i>'
title: |
  ![](../GCSKO_logo.jpg){width=300px}  
  Merging Smart-seq2 and 10X Datasets - wild-type only
author: "[Andrew Russell](https://ajcrussell.wixsite.com/mysite/about)"
institute: Wellcome Sanger Institute
date: '`r format(Sys.Date(), "%B %d, %Y")`'
output:
  html_notebook:
    theme: cosmo
    toc: yes
    toc_depth: 3
    #toc_float: yes
    df_print: paged
---
***
# 1. Introduction and Aims {.tabset}

We have quality-controlled the 10X data and the SS2 data and now are left with the following objects:

10X 5K data - pb_sex_filtered

10X 30K data - pb_30k_sex_filtered 

SS2 mutant data - ss2_mutants_final

# 2. Read in the data  {.tabset}

### Load/Install the Required Packages

```{r load packages, echo = FALSE}
## CRAN packages

## Pathwork is needed to stich plots together using '+'
if(require("patchwork", quietly = TRUE)){
    print("patchwork is loaded correctly")
} else {
    print("trying to install patchwork")
    install.packages("patchwork")
    if(require(patchwork)){
        print("patchwork installed and loaded")
    } else {
        stop("could not install patchwork")
    }
}

## viridis allows different colours to be added to plots
if(require("viridis", quietly = TRUE)){
    print("viridis is loaded correctly")
} else {
    print("trying to install viridis")
    install.packages("viridis")
    if(require(viridis)){
        print("viridis installed and loaded")
    } else {
        stop("could not install viridis")
    }
}

## Seurat is needed for most of this script
if(require("Seurat", quietly = TRUE)){
    print("Seurat is loaded correctly")
} else {
    print("trying to install Seurat")
    install.packages("Seurat")
    if(require(Seurat)){
        print("Seurat installed and loaded")
    } else {
        stop("could not install Seurat")
    }
}

## cowplot is needed for plots in this script
if(require("cowplot")){
    print("cowplot is loaded correctly")
} else {
    print("trying to install cowplot")
    install.packages("cowplot")
    if(require(cowplot)){
        print("cowplot installed and loaded")
    } else {
        stop("could not install cowplot")
    }
}

## gridExtra is needed for grid graphics to plot multiple plots in the same view
if(require("gridExtra")){
    print("gridExtra is loaded correctly")
} else {
    print("trying to install gridExtra")
    install.packages("gridExtra")
    if(require(gridExtra)){
        print("gridExtra installed and loaded")
    } else {
        stop("could not install gridExtra")
    }
}

## grid is needed for grid.arrange function to change size of title
if(require("grid")){
    print("grid is loaded correctly")
} else {
    print("trying to install grid")
    install.packages("grid")
    if(require(grid)){
        print("grid installed and loaded")
    } else {
        stop("could not install grid")
    }
}

##for doing bulk correlation calculations
if(require("Hmisc")){
    print("Hmisc is loaded correctly")
} else {
    print("trying to install Hmisc")
    install.packages("Hmisc")
    if(require(Hmisc)){
        print("Hmisc installed and loaded")
    } else {
        stop("could not install Hmisc")
    }
}

## reshape2 to melt dataframes for plotting:
if(require("reshape2")){
    print("reshape2 is loaded correctly")
} else {
    print("trying to install reshape2")
    install.packages("reshape2")
    if(require(reshape2)){
        print("reshape2 installed and loaded")
    } else {
        stop("could not install reshape2")
    }
}

## to work with data frames:
if(require("dplyr")){
    print("dplyr is loaded correctly")
} else {
    print("trying to install dplyr")
    install.packages("dplyr")
    if(require(dplyr)){
        print("dplyr installed and loaded")
    } else {
        stop("could not install dplyr")
    }
}

## non-CRAN packages

## to make density plots showing gene expression
if(require("Nebulosa")){
    print("Nebulosa is loaded correctly")
} else {
    print("trying to install Nebulosa")
    devtools::install_github("powellgenomicslab/Nebulosa")
    if(require(Nebulosa)){
        print("Nebulosa installed and loaded")
    } else {
        stop("could not install Nebulosa")
    }
}

## monocle3 to calculate pseudotime:
if(require("monocle3")){
    print("monocle3 is loaded correctly")
} else {
    print("Please install monocle3 (https://cole-trapnell-lab.github.io/monocle3/docs/installation/)")
}

## set the seed for both the mixture models and also for the sample function later on:
set.seed(-92497)
```

### Read in the Data

screen hits
```{r}
## EDIT - change this to the excel table once we have it finalized for the screen
screen_hits <- c("PBANKA-0516300",
"PBANKA-1217700",
"PBANKA-0409100",
"PBANKA-1034300",
"PBANKA-1437500",
"PBANKA-0827500",
"PBANKA-0824300",
"PBANKA-1426900",
"PBANKA-0105300",
"PBANKA-0921100",
"PBANKA-1002400",
"PBANKA-0829400",
"PBANKA-1347200",
"PBANKA-0828000",
"PBANKA-0902300",
"PBANKA-1418100",
"PBANKA-1435200",
"PBANKA-1454800",
"PBANKA-0712300",
"PBANKA-0410500",
"PBANKA-1144800",
"PBANKA-1231600",
"PBANKA-0503200",
"PBANKA-0308900",
"PBANKA-1214700",
"PBANKA-0709900",
"PBANKA-0311900",
"PBANKA-0716500",
"PBANKA-1447900",
"PBANKA-0102200",
"PBANKA-0713500",
"PBANKA-0102400",
"PBANKA-1302700",
"PBANKA-1235900",
"PBANKA-0401100",
"PBANKA-0413400",
"PBANKA-1126900",
"PBANKA-1425900",
"PBANKA-0418300",
"PBANKA-1464600",
"PBANKA-0806000")
```

Read in gene annotations
```{r}
gene_annotations <- read.table("../data/Reference/GenesByTaxon_Summary.csv", header = TRUE, sep = ",", stringsAsFactors = TRUE)
dim(gene_annotations)

## convert _ to -
gene_annotations$Gene.ID <- gsub("_", "-", gene_annotations$Gene.ID)
```

load in datasets
```{r}
## load the 10X dataset
pb_sex_filtered <- readRDS("../data_to_export/pb_sex_filtered.RDS")
## load the SS2 dataset
ss2_mutants_final <- readRDS("../data_to_export/ss2_mutants_final.RDS")

## inspect
paste("10x dataset")
pb_sex_filtered
paste("Smart-seq2 dataset")
ss2_mutants_final
paste("The composition of the Smart-seq2 dataset is:")
table(ss2_mutants_final@meta.data$genotype)
```

# 3. Merging the Smart-seq2 and 10X Data {.tabset}

### Prepare data

```{r integration 10x setup}
## extract 10x data
tenx_5k_counts <- as.matrix(pb_sex_filtered@assays$RNA@counts)
tenx_5k_pheno <- pb_sex_filtered@meta.data

## Create fresh object
tenx_5k_counts_to_integrate <- CreateSeuratObject(counts = tenx_5k_counts, meta.data = tenx_5k_pheno, min.cells = 0, min.features = 0, project = "GCSKO")

## add experiment meta data
tenx_5k_counts_to_integrate@meta.data$experiment <- "tenx_5k"

## inspect
tenx_5k_counts_to_integrate
```

We need to make sure the mutant data is compatible with the 10X data. the 10X data has fewer genes represented so we need to find the intersect of the two before integration.
```{r integration ss2 setup}
## extract SS2 data 
mutant_counts_for_integration <- as.matrix(ss2_mutants_final@assays$RNA@counts)
mutant_pheno_for_integration <- ss2_mutants_final@meta.data

## change counts so the :rRNA and :tRNA are not there:
rownames(mutant_counts_for_integration) <- gsub(":ncRNA", "", gsub(":rRNA", "", gsub(":tRNA", "", rownames(mutant_counts_for_integration))))

## change the gene names so that they are - rather than _:
rownames(mutant_counts_for_integration) <- gsub("_", "-", rownames(mutant_counts_for_integration))

## calculate how many of the genes overlap - 10x does start out with 5098 vs 5245
genes_in_tenx_dataset <- intersect(rownames(tenx_5k_counts), rownames(mutant_counts_for_integration))
## print number of genes that overlap
dim(mutant_counts_for_integration)
## subset the mutant counts to contain only 10x genes
mutant_counts_for_integration <- mutant_counts_for_integration[which(rownames(mutant_counts_for_integration) %in% genes_in_tenx_dataset), ]
## print result of genes that overlap
dim(mutant_counts_for_integration)

## make Seurat object:
GCSKO_mutants <- CreateSeuratObject(counts = mutant_counts_for_integration, meta.data = mutant_pheno_for_integration, min.cells = 0, min.features = 0, project = "GCSKO")

## add experiment meta data
GCSKO_mutants@meta.data$experiment <- "mutants"

## inspect
GCSKO_mutants
```

```{r}
## double check that this is the same number of genes
## subset counts so that only genes represented in the other two objects are there:
length(intersect(rownames(tenx_5k_counts), rownames(mutant_counts_for_integration)))
```

IMPORTANT - this next step is different to GCSKO_merge as it subsets the smart-seq2 data into wild-type only. 
```{r}
## subset wt on ss2 data:
ss2_wt_cells <- rownames(GCSKO_mutants@meta.data[GCSKO_mutants@meta.data$genotype == "WT", ])
GCSKO_mutants_wtonly <- subset(GCSKO_mutants, cells = ss2_wt_cells)
```

create list and normalise:
```{r integration normalise}
## make list
tenx.justwt.list <- list(tenx_5k_counts_to_integrate, GCSKO_mutants_wtonly)

## prepare data
for (i in 1:length(tenx.justwt.list)) {
    tenx.justwt.list[[i]] <- NormalizeData(tenx.justwt.list[[i]], verbose = FALSE)
    tenx.justwt.list[[i]] <- FindVariableFeatures(tenx.justwt.list[[i]], selection.method = "vst", 
        nfeatures = 2000, verbose = FALSE)
    all.genes <- rownames(tenx.justwt.list[[i]])
    tenx.justwt.list[[i]] <- ScaleData(tenx.justwt.list[[i]], features = all.genes)
}
```

### Integrate objects

```{r integration}
## Find anchors
tenx.justwt.anchors <- FindIntegrationAnchors(object.list = tenx.justwt.list, dims = 1:21, verbose = FALSE)

## Integrate data
tenx.justwt.integrated <- IntegrateData(anchorset = tenx.justwt.anchors, dims = 1:21, verbose = FALSE, features.to.integrate = genes_in_tenx_dataset)
```

# 4. Dimensionality reduction {.tabset}

### PCA
```{r, fig.width = 10, fig.height = 5}
## Make the default assay integrated
DefaultAssay(tenx.justwt.integrated) <- "integrated"

## Run the standard workflow for visualization and clustering
tenx.justwt.integrated <- ScaleData(tenx.justwt.integrated, verbose = FALSE)
tenx.justwt.integrated <- RunPCA(tenx.justwt.integrated, npcs = 30, verbose = FALSE)

## inspect PCs
ElbowPlot(tenx.justwt.integrated, ndims = 30, reduction = "pca")
```

### Optimised UMAP
After optimisation, the following UMAP can be calculated:
```{r umap run 2, fig.height = 7, fig.width = 7}
## Run optimised UMAP
tenx.justwt.integrated <- RunUMAP(tenx.justwt.integrated, reduction = "pca", dims = 1:10, n.neighbors = 50, seed.use = 1234, min.dist = 0.4, repulsion.strength = 0.03, local.connectivity = 150)
```

```{r umap visualise 2}
## plot
dp1 <- DimPlot(tenx.justwt.integrated, label = TRUE, repel = FALSE, pt.size = 0.05, dims = c(1,2), split.by = "experiment") + 
  ## fix the axis
  coord_fixed() #+ 
  ## reverse the scale
  #scale_y_reverse()

## view
dp1
```

Make final plots:
```{r, fig.height = 10, fig.width = 10}

## split seurat object up
ob.list <- SplitObject(tenx.justwt.integrated, split.by = "experiment")

## make plots for each object
plot.list <- lapply(X = ob.list, FUN = function(x) {
    DimPlot(x, dims = c(1,2), pt.size = 1) + theme(legend.position="bottom")
})

composition_umap_10x <- plot.list$`tenx_5k` + 
  coord_fixed() +
  theme_void() +
  scale_color_manual(values=c(replicate(45, "#999999"))) +
  labs(title = paste("10x (wild-type)")) +
  theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))

composition_umap_ss2 <- plot.list$mutants +
  coord_fixed() +
  theme_void() +
  scale_color_manual(values=c(replicate(46, "#999999"))) +
  labs(title = paste("Smart-seq2 (wild-type)")) +
  theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))

composition_umap <- composition_umap_10x + composition_umap_ss2 

composition_umap
```

save
```{r}
ggsave("../images_to_export/composition_umap.png", plot = composition_umap, device = "png", path = NULL, scale = 1, width = 20, height = 20, units = "cm", dpi = 300, limitsize = TRUE)
```

```{r}
## make plots
## hoo dataset correlation
UMAP_hoo <- DimPlot(tenx.justwt.integrated, pt.size = 0.1, group.by = "Prediction.Spearman.", dims = c(1,2)) +
  coord_fixed() + 
  theme_void() +
  labs(title = paste("Hoo Predicted Timepoint")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
  scale_colour_manual(values = inferno(12))  +
  labs(colour = "hour") +
  theme(legend.position = "bottom", legend.title=element_text(size=10))

## ap2g timecourse in this paper correlation
UMAP_kasia <- DimPlot(tenx.justwt.integrated, pt.size = 0.1, group.by = "Prediction.Spearman._Kasia", dims = c(1,2)) +
  coord_fixed() + 
  theme_void() +
  labs(title = paste("AP2G Timecourse Predicted Timepoint")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
  scale_colour_manual(values = inferno(10))  +
  labs(colour = "hour") +
  theme(legend.position = "bottom", legend.title=element_text(size=10))

## combine
umap_bulk <- wrap_plots(UMAP_hoo, UMAP_kasia, ncol = 2)

## print
umap_bulk
```

## MCA Mapping

## Build the index

```{r}
## load in mca data
counts <- read.csv("../data/Reference/scmap/allpb10x_counts.csv", row.names = 1)
pheno <- read.csv("../data/Reference/scmap/allpb10x_pheno.csv")

## change dash
rownames(counts) <- gsub("_", "-", rownames(counts))

## load MCA pal
mca_colors <- c("6"="#78C679",
            "2"="#D1EC9F",
            "0"="#FEB24C",
            "1"="#F4CF63",
            "3"="#FEEEAA",
            "4"="#85B1D3",
            "7"="#9ecae1",
            "5"="#C9E8F1",
            "M"= "#B7B7D8",
            "F"="#9C96C6",
            "unassigned"="darkgrey")

## plot
ggplot(pheno, aes(x=PC2_3d, y = PC3_3d, colour=absclust3)) + 
  geom_point() +
  theme_classic() +
  scale_color_manual(values = mca_colors, labels = (c("0 - early troph", "1 - mid troph","2 - late ring", "3 - late troph","4 - early schizont", "5 - late schizont", "6 - early ring","7 - mid schizont", "female", "male", "unassigned")))
```

```{r}
###Making an ortholog reference index

## load required libraries
library(scmap)
library(SingleCellExperiment)

#prep the SCE, if was originally a Suerat object need the dfs to be regular matrices
#pb_filtered_sce_orth <- pb_filtered_sce_orth[, colData(pb_filtered_sce_orth)$absclust3 != "8"]
#sce <- pb_filtered_sce_orth
#pca <- plotPCA(sce)
#pcs <- pca$data
#table(rownames(pcs)==colnames(sce))
#colData(sce) <- cbind(colData(sce), pcs)
#rowData(sce)$feature_symbol <- rowData(sce)$gene
sce <- SingleCellExperiment(list(counts=counts),
    colData=DataFrame(label=pheno),
    rowData=DataFrame(feature_symbol=rownames(counts)))
sce
counts_1 <- assay(sce, "counts")
libsizes <- colSums(counts_1)
size.factors <- libsizes/mean(libsizes)
logcounts(sce) <- log2(t(t(counts_1)/size.factors) + 1)
#counts(sce) <- as.matrix(counts(sce))
#logcounts(sce) <- as.matrix(logcounts(sce))
# remove features with duplicated names
sce <- sce[!duplicated(rownames(sce)), ]

#build scmap-cell reference index, save this rds
sce <- selectFeatures(sce, suppress_plot = FALSE, n_features = 500)
table(rowData(sce)$scmap_features)

set.seed(1)
sce <- indexCell(sce)
names(metadata(sce)$scmap_cell_index)
length(metadata(sce)$scmap_cell_index$subcentroids)
dim(metadata(sce)$scmap_cell_index$subcentroids[[1]])
metadata(sce)$scmap_cell_index$subcentroids[[1]][,1:5]
dim(metadata(sce)$scmap_cell_index$subclusters)

#saveRDS(pb_filtered_sce_orth, file="pb_filtered_sce_orthindex_20181109.rds")
```

## Map to the index

```{r}
## convert seurat object above into sce for this
pf_field_counts <- as.matrix(tenx.justwt.integrated@assays$RNA@data)
pf_field_pheno <- tenx.justwt.integrated@meta.data

#prep the SCE, if was originally a Suerat object need the dfs to be regular matrices
#pb_filtered_sce_orth <- pb_filtered_sce_orth[, colData(pb_filtered_sce_orth)$absclust3 != "8"]
#sce <- pb_filtered_sce_orth
#pca <- plotPCA(sce)
#pcs <- pca$data
#table(rownames(pcs)==colnames(sce))
#colData(sce) <- cbind(colData(sce), pcs)
#rowData(sce)$feature_symbol <- rowData(sce)$gene
pf_live_field.sce.orth <- SingleCellExperiment(list(counts=pf_field_counts),
    colData=DataFrame(label=pf_field_pheno),
    rowData=DataFrame(feature_symbol=rownames(pf_field_counts)))
pf_live_field.sce.orth
counts_1 <- assay(pf_live_field.sce.orth, "counts")
libsizes <- colSums(counts_1)
size.factors <- libsizes/mean(libsizes)
logcounts(pf_live_field.sce.orth) <- log2(t(t(counts_1)/size.factors) + 1)


#Project query data set onto cell index
scmapCell_results <- scmapCell(
  pf_live_field.sce.orth, 
  list(mca = metadata(sce)$scmap_cell_index
  )
)

##Look into the results
# For each dataset there are two matricies. cells matrix contains the top 10 (scmap default) cell IDs of the cells of the reference dataset that a given cell of the projection dataset is closest to:
#   
#   Give assignments in two ways:
#   1. Take the top cell assignment abs clust, if cosine similarity is less than 0.4 (or adjust if needed) mark as unassigned
# 2. For the top 3 nearest neighbors, get a mean of the PCA coordinates and snap to the nearest cell of those coordinates. If any of the top three cells are sim below 0.4 then mark as unassigned.


##Top cell assignment method
scmapCell_results$mca$cells[, 1:3]
getcells <- scmapCell_results$mca$cells[1, ]
cdsce <- colData(sce)[getcells, ]
topsim <- scmapCell_results$mca$similarities[1, ]

pf_live_field.sce.orth$topcell <- cdsce$label.sample_id
pf_live_field.sce.orth$topcell_ac <- cdsce$label.absclust3
pf_live_field.sce.orth$indexPC1 <- cdsce$label.PC1
pf_live_field.sce.orth$indexPC2 <- cdsce$label.PC2
#pf_live_field.sce.orth$pbpt <- cdsce$pseudotime
#pf_live_field.sce.orth$pbbulk <- cdsce$bulk
pf_live_field.sce.orth$topcell_sp <- pf_live_field.sce.orth$topcell_ac
pf_live_field.sce.orth$topsim <- topsim
pf_live_field.sce.orth$topcell_sp[pf_live_field.sce.orth$topsim < 0.3] <- "unassigned"
table(pf_live_field.sce.orth$topcell_sp)
```

add these metrics back into the object 

```{r}
## add the meta data from scmap to the object
# live
## extract
df_plot_live <- data.frame(topcell_sp = colData(pf_live_field.sce.orth)$topcell_sp, topsim = colData(pf_live_field.sce.orth)$topsim, row.names = rownames(colData(pf_live_field.sce.orth)))

## add to object
tenx.justwt.integrated <- AddMetaData(
  object = tenx.justwt.integrated,
  metadata = df_plot_live,
  col.name = c('mca_topcell_sp', 'mca_topsim')
)

## need to fix the NA in topcell_sp
# Un-factorize (as.numeric can be use for numeric values)
#              (as.vector  can be use for objects - not tested)
tenx.justwt.integrated@meta.data$mca_topcell_sp <- as.character(tenx.justwt.integrated@meta.data$mca_topcell_sp)
## add unassigned
tenx.justwt.integrated@meta.data$mca_topcell_sp[is.na(tenx.justwt.integrated@meta.data$mca_topcell_sp)] <- "unassigned"
# Re-factorize with the as.factor function or simple factor(fixed$Type)
tenx.justwt.integrated@meta.data$mca_topcell_sp <- as.factor(tenx.justwt.integrated@meta.data$mca_topcell_sp)
## check
table(tenx.justwt.integrated@meta.data$mca_topcell_sp)
```

```{r}
## reorder levels
tenx.justwt.integrated@meta.data$mca_topcell_sp <- factor(tenx.justwt.integrated@meta.data$mca_topcell_sp, levels = rev(c("M", "F", "5", "7", "4", "3", "1", "0", "2", "6")))

UMAP_mca <- DimPlot(tenx.justwt.integrated, pt.size = 0.1, group.by = "mca_topcell_sp", dims = c(1,2)) +
  coord_fixed() + 
  theme_void() +
  labs(title = paste("MCA Predicted Cell Type"), colour = "MCA Cell Type") + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
  scale_color_manual(values = mca_colors, labels = (c("6 - early ring", "2 - late ring", "0 - early troph", "1 - mid troph", "3 - late troph", "4 - early schizont", "7 - mid schizont", "5 - late schizont", "female", "male"))) +
  theme(legend.position = "right", legend.title=element_text(size=15))

UMAP_mca
```

save
```{r}
ggsave("../images_to_export/UMAP_mca.png", plot = UMAP_mca, device = "png", path = NULL, scale = 1, width = 15, height = 15, units = "cm", dpi = 300, limitsize = TRUE)
```

# 5. Clustering {.tabset} 

### Generate clusters

```{r}
## generate new clusters at low resolution
tenx.justwt.integrated <- FindNeighbors(tenx.justwt.integrated, dims = 1:19, reduction = "pca")
tenx.justwt.integrated <- FindClusters(tenx.justwt.integrated, resolution = 2, random.seed = 42, algorithm = 2)
```

### visualise clusters

```{r}
DimPlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE, repel = FALSE, label = TRUE)
```

Make individual plots highlighting where cells in each cluster fall
```{r, echo = FALSE, message=FALSE}
## for loop which takes each cluster and makes a list of cells and then plots a highlighted plot and adds it to a list

## make a blank list
list_UMAPs_by_cluster <- vector(mode = "list", length = length(levels(tenx.justwt.integrated@meta.data$integrated_snn_res.2)))

## for loop
for(i in seq_along(levels(tenx.justwt.integrated@meta.data$integrated_snn_res.2))){
  ## make a list of cells
  list_of_cells <- rownames(tenx.justwt.integrated@meta.data[which(tenx.justwt.integrated@meta.data$integrated_snn_res.2 == levels(tenx.justwt.integrated@meta.data$integrated_snn_res.2)[i]), ])
  umap_plot <- DimPlot(tenx.justwt.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = list_of_cells, dims = c(1,2), reduction = "umap") +
    ## fix coordinates
    coord_fixed() + 
  scale_color_manual(values=c("#D3D3D3", "#1D1564")) + 
  theme_void() + 
  labs(title = paste("cluster", levels(tenx.justwt.integrated@meta.data$integrated_snn_res.2)[i])) + 
  theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
  ## add to the list
  list_UMAPs_by_cluster[[i]] <- umap_plot
}

## check number of clusters
length(list_UMAPs_by_cluster)
```

plot
```{r, fig.height = 10, fig.width = 10}
## this function writes the next bit of code for you
## put it into the console and paste the response
#ploty <- c()
#for(i in seq_along(levels(tenx.justwt.integrated@meta.data$seurat_clusters))){
#  ploty <- paste0(ploty, "list_UMAPs_by_cluster[[", i, "]]", " + ")
#}

## plot
list_UMAPs_by_cluster[[1]] + list_UMAPs_by_cluster[[2]] + list_UMAPs_by_cluster[[3]] + list_UMAPs_by_cluster[[4]] + list_UMAPs_by_cluster[[5]] + list_UMAPs_by_cluster[[6]] + list_UMAPs_by_cluster[[7]] + list_UMAPs_by_cluster[[8]] + list_UMAPs_by_cluster[[9]] + list_UMAPs_by_cluster[[10]] + list_UMAPs_by_cluster[[11]] + list_UMAPs_by_cluster[[12]] + list_UMAPs_by_cluster[[13]] + list_UMAPs_by_cluster[[14]] + list_UMAPs_by_cluster[[15]] + list_UMAPs_by_cluster[[16]] + list_UMAPs_by_cluster[[17]] + list_UMAPs_by_cluster[[18]] + list_UMAPs_by_cluster[[19]] + list_UMAPs_by_cluster[[20]] + list_UMAPs_by_cluster[[21]] + list_UMAPs_by_cluster[[22]] + list_UMAPs_by_cluster[[23]] + list_UMAPs_by_cluster[[24]]
```

### Find Markers

```{r}
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-0102400         GCSKO-2  MD4 
# PBANKA-0716500        GCSKO-19  MD5 
# PBANKA-1435200        GCSKO-20  FD4 
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-0413400    GCSKO-10_820  MD3
# PBANKA-0828000         GCSKO-3  GD1
# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-1144800        GCSKO-28  FD5

DefaultAssay(tenx.justwt.integrated) <- "RNA"
# find markers for every cluster compared to all remaining cells, report only the positive ones
pb.markers <- FindAllMarkers(tenx.justwt.integrated, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25, test.use = "MAST")
pb.markers %>% group_by(cluster) %>% top_n(n = 5, wt = avg_logFC)
```

```{r}
markers_annotated <- merge(pb.markers, gene_annotations,  by.x = "gene", by.y = "Gene.ID", all = FALSE)
View(markers_annotated)
```

```{r, R.options = list(width = 300)}
markers_subset <- pb.markers %>% group_by(cluster) %>% top_n(n = 20, wt = avg_logFC)
markers_subset_annotated <- merge(markers_subset, gene_annotations,  by.x = "gene", by.y = "Gene.ID", all = FALSE)
markers_subset_annotated
View(markers_subset_annotated)
```

```{r}
View(markers_annotated_mutant <- markers_annotated[which(markers_annotated$gene %in% list_of_mutant_genes), ])
```


### visualise markers

```{r, fig.height = 10, fig.width = 10}
## get the top 10 for each cluster
top10 <- pb.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_logFC)

## scale data is missing - but original dfs have scale.data
# df_scaled <- as.matrix(transform(merge(as.data.frame(tenx.justwt.list[[1]]@assays$RNA@scale.data), as.data.frame(tenx.justwt.list[[2]]@assays$RNA@scale.data), by = 0, all=FALSE), row.names=Row.names, Row.names=NULL))
# tenx.justwt.integrated@assays$RNA@scale.data <- df_scaled

## scale data in RNA slot
all.genes <- rownames(tenx.justwt.integrated)
tenx.justwt.integrated <- ScaleData(tenx.justwt.integrated, vars.to.regress = 'experiment', features = all.genes)

## heatmap
DoHeatmap(tenx.justwt.integrated, features = top10$gene) + NoLegend()
```

# 6. Define Cluster Identities {.tabset} 

### Downsampling
```{r}
# ## check number of cells in each cluster
# df_clusters <- as.data.frame(table(tenx.justwt.integrated@meta.data$integrated_snn_res.2))
# plot(df_clusters)
# 
# ## downsample using appropriate metrics
# tenx.justwt.integrated.downsampled <- subset(tenx.justwt.integrated, downsample = 100)
#   
# ## inspect plot
# FeaturePlot(tenx.justwt.integrated.downsampled, features = "PBANKA-1437500", coord.fixed = TRUE, dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
#   theme_void() + 
#   labs(title = paste("AP2G (Commitment)")) + 
#   theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
#   scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
```

### Marker Genes

useful tools for all plots
```{r}
## define male and female symbol
female_symbol <- intToUtf8(9792)
male_symbol <- intToUtf8(9794)

## define colour pal
marker_gene_ramp <- colorRampPalette(c("#D3D3D3", "#1D1564"))(50)
```

#### Expression - cutoffs - purple

```{r, fig.height = 15, fig.width = 15}

marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1319500", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(ccp2)~(Female))) + 
                          theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=6.0), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_MG1 <- FeaturePlot(tenx.justwt.integrated, 
                                    features = "PBANKA-0416100", 
                                    coord.fixed = TRUE, 
                                    min.cutoff = "q05", 
                                    max.cutoff = "q95", 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') +
                        coord_fixed() + 
                        theme_void() + 
                        labs(title = expression(italic(mg1)~(Male))) + 
                        theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                        scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_AP2G <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1437500", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(ap2g)~(Commitment))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_MSP1 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-0831000", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(msp1)~(Schizont))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_MSP8 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1102200", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(msp8)~(Asexual))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_SBP1 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1101300", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE,
                                     slot = 'scale.data') +
                          coord_fixed() + 
                          theme_void() + 
                          labs(title = expression(italic(SBP1)~(Ring))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_FAMB <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-0722600", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE,
                                     slot = 'scale.data') +
                          coord_fixed() + 
                          theme_void() + 
                          labs(title = expression(italic(fam-b2)~(Ring))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_ORC1 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-0602000", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE,
                                     slot = 'scale.data') +
                          coord_fixed() + 
                          theme_void() + 
                          labs(title = expression(italic(orc1)~(DNA~replication))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_CDPK5 <- FeaturePlot(tenx.justwt.integrated, 
                                      features = "PBANKA-1351500", 
                                      coord.fixed = TRUE, 
                                      min.cutoff = "q05", 
                                      max.cutoff = "q95", 
                                      dims = c(1,2), 
                                      reduction = "umap", 
                                      pt.size = 0.5, 
                                      order = TRUE, 
                                      slot = 'scale.data') +
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(cdpk5)~(Schizont))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_MCM4 <- FeaturePlot(tenx.justwt.integrated, 
                                      features = "PBANKA-1415600", 
                                      coord.fixed = TRUE, 
                                      min.cutoff = "q05", 
                                      max.cutoff = "q95", 
                                      dims = c(1,2), 
                                      reduction = "umap", 
                                      pt.size = 0.5, 
                                      order = TRUE, 
                                      slot = 'scale.data') +
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(mcm4)~(DNA~replication))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

# + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

## plot
## cowplot method
marker_gene_plot_all <- plot_grid(marker_gene_plot_FAMB, marker_gene_plot_MSP8, marker_gene_plot_MSP1, marker_gene_plot_AP2G, marker_gene_plot_CCP2, marker_gene_plot_MG1, marker_gene_plot_HSP70, nrow=3)

marker_gene_plot_all
```

### Mutant Genes

#### Expression - with cutoffs

```{r, fig.height = 10, fig.width = 10}
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-0102400         GCSKO-2  MD4 
# PBANKA-0716500        GCSKO-19  MD5 
# PBANKA-1435200        GCSKO-20  FD4 
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-0413400    GCSKO-10_820  MD3
# PBANKA-0828000         GCSKO-3  GD1
# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-1144800        GCSKO-28  FD5


marker_gene_plot_fd3 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-1418100", 
                                   coord.fixed = TRUE,
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95", 
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                        theme_void() + 
                        labs(title = expression(italic(fd3))) + 
                        theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                        ## add sex symbols
                        #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                        #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md4 <- FeaturePlot(tenx.justwt.integrated, 
                                  dims = c(1,2), 
                                  reduction = "umap", 
                                  features = "PBANKA-0102400", 
                                  coord.fixed = TRUE, 
                                  min.cutoff = "q05",
                                  max.cutoff = "q95", 
                                  pt.size = 0.5, 
                                  order = TRUE, 
                                  slot = 'scale.data') + 
                      theme_void() + 
                      labs(title = expression(italic(md4))) + 
                      theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                      ## add sex symbols
                      #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                      #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md5 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-0716500", 
                                   coord.fixed = TRUE, 
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95",
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md5))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = "")) 
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_fd4 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-1435200", 
                                   coord.fixed = TRUE, 
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95", 
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(fd4))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = "")) 
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_fd2 <- FeaturePlot(tenx.justwt.integrated, 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    features = "PBANKA-0902300", 
                                    coord.fixed = TRUE, 
                                    min.cutoff = "q05", 
                                    max.cutoff = "q95", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(fd2))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                       # + annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md3 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-0413400", 
                                   coord.fixed = TRUE, 
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95",
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md3))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                      #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_gd1 <- FeaturePlot(tenx.justwt.integrated, 
                                  dims = c(1,2), 
                                  reduction = "umap",
                                  features = "PBANKA-0828000", 
                                  coord.fixed = TRUE, 
                                  min.cutoff = "q05", 
                                  max.cutoff = "q95",
                                  pt.size = 0.5, 
                                  order = TRUE, 
                                  slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(gd1))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md1 <- FeaturePlot(tenx.justwt.integrated, 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    features = "PBANKA-1302700",
                                    coord.fixed = TRUE,
                                    min.cutoff = "q05", 
                                    max.cutoff = "q95", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md1))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md2 <- FeaturePlot(tenx.justwt.integrated,
                                   dims = c(1,2), 
                                   reduction = "umap",
                                   features = "PBANKA-1447900",
                                   coord.fixed = TRUE,
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95", 
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md2))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_fd1 <- FeaturePlot(tenx.justwt.integrated, 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    features = "PBANKA-1454800", 
                                    coord.fixed = TRUE, 
                                    min.cutoff = "q05",
                                    max.cutoff = "q95", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(fd1))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))
# scale_color_continuous_sequential(palette = "Purples 2")
# scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

## make composite plot
mutant_expression_composite <- wrap_plots(marker_gene_plot_gd1 , marker_gene_plot_md1 , marker_gene_plot_md2 , marker_gene_plot_md3 , marker_gene_plot_md4 , marker_gene_plot_md5 , marker_gene_plot_fd1 , marker_gene_plot_fd2 , marker_gene_plot_fd3 , marker_gene_plot_fd4, ncol = 4)
           
## print
mutant_expression_composite
```

save
```{r}
ggsave("../images_to_export/ALL_CELLS_wt_gene_expression.png", plot = mutant_expression_composite, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
```

```{r, fig.height = 10, fig.width = 10}
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-1435200        GCSKO-20  FD4 
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-0413400    GCSKO-10_820  MD5
# PBANKA-0828000         GCSKO-3  GD1
# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-1144800        GCSKO-28  FD5


marker_gene_plot_17 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1418100", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_2 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0102400", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_19 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0716500", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_20 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1435200", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_13 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0902300", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_10 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0413400", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md5")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_3 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0828000", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("gd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_oom <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1302700", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_29 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1447900", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_21 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1454800", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))

## make composite plot
mutant_expression_composite <- wrap_plots(marker_gene_plot_17 , marker_gene_plot_2 , marker_gene_plot_19 , marker_gene_plot_20 , marker_gene_plot_13 , marker_gene_plot_10 , marker_gene_plot_3 , marker_gene_plot_oom , marker_gene_plot_29 , marker_gene_plot_21 , ncol = 4)
           
## print
mutant_expression_composite
```



# 7. Pseudotime on all cells {.tabset}

### Pseudotime calculation

```{r}
## extract data from Seurat
seurat.object.all <- tenx.justwt.integrated
# counts
data <- as(as.matrix(GetAssayData(seurat.object.all, assay = "integrated", slot = "data")), 'sparseMatrix')
# meta data
pd <- data.frame(seurat.object.all@meta.data)

## keep only the columns that are relevant
#pData <- pd %>% select(orig.ident, nCount_RNA, nFeature_RNA)
## add gene short name
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))

## Construct monocle cds
monocle.object.all <- new_cell_data_set(expression_data = data, cell_metadata = pd, gene_metadata = fData)
## preprocess
monocle.object.all = preprocess_cds(monocle.object.all, num_dim = 100, norm_method = "none")
## plot variance explained plot
#plot_pc_variance_explained(monocle.object.all)
## make monocle UMAP
#monocle.object.all = reduce_dimension(monocle.object.all, reduction_method = "UMAP", preprocess_method = "PCA", umap.metric = "euclidean", umap.n_neighbors = 20, umap.min_dist = 0.5, verbose = FALSE)
#plot_cells(monocle.object.all)

## add UMAP from Seurat
monocle.object.all@int_colData@listData$reducedDims@listData[["UMAP"]] <-seurat.object.all@reductions[["umap"]]@cell.embeddings 
plot_cells(monocle.object.all)

## cluster
monocle.object.all = cluster_cells(monocle.object.all)

## plot clusters
plot_cells(monocle.object.all, color_cells_by="partition", group_cells_by="partition",  x = 1, y = 2)

## reduce partitions to 1
monocle.object.all@clusters$UMAP$partitions[monocle.object.all@clusters$UMAP$partitions == "2"] <- "1"

#map pseudotime
monocle.object.all = learn_graph(monocle.object.all, learn_graph_control=list(ncenter=550, minimal_branch_len = 15), use_partition = FALSE)

plot_cells(monocle.object.all, color_cells_by="partition", group_cells_by="partition",  x = 1, y = 2)

## a helper function to identify the root principal points:
## make cluster 2 the root
# get_earliest_principal_node <- function(cds, time_bin="7"){
#   cell_ids <- which(colData(cds)[, "seurat_clusters"] == time_bin)
#   closest_vertex <-
#   cds@principal_graph_aux[["UMAP"]]$pr_graph_cell_proj_closest_vertex
#   closest_vertex <- as.matrix(closest_vertex[colnames(cds), ])
#   root_pr_nodes <-
#   igraph::V(principal_graph(cds)[["UMAP"]])$name[as.numeric(names
#   (which.max(table(closest_vertex[cell_ids,]))))]
#   
#   root_pr_nodes
# }

## calculate pseudotime
#monocle.object.all = order_cells(monocle.object.all, root_pr_nodes=get_earliest_principal_node(monocle.object.all))
monocle.object.all = order_cells(monocle.object.all)
## used 5 points at the beginning


## plot
umap_pt <- plot_cells(monocle.object.all, color_cells_by = "pseudotime", label_cell_groups=FALSE, cell_size = 1, x = 1, y = 2, label_branch_points=FALSE, label_leaves=FALSE, label_groups_by_cluster=FALSE, label_roots = FALSE) +
  coord_fixed() +
  theme_void() +
  labs(title = "") +
  theme(plot.title = element_text(hjust = 0.5, size=20), legend.position="bottom", legend.title=element_text (size=20), legend.text=element_text(size=20)) + 
  guides(colour = guide_colourbar(barwidth = 10, barheight = 2, title = "Pseudotime"))

## view plot
umap_pt

## help was obtained from here
## https://github.com/satijalab/seurat/issues/1658
```

save
```{r}
ggsave("../images_to_export/UMAP_pt_wt.png", plot = umap_pt, device = "png", path = NULL, scale = 1, width = 20, height = 20, units = "cm", dpi = 300, limitsize = TRUE)
```

### gganimnate GIF of pseuodtime

Select appropriate cells
```{r}
plot <- DimPlot(tenx.justwt.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, dims = c(1,2), reduction = "umap")
HoverLocator(plot, information = FetchData(object = tenx.justwt.integrated, vars = 'monocle_sex'))
```

```{r}
cells_to_remove <- c("SC26779_8_280", "CTGGTCTAGCAGATCG")
```

```{r}
#install.packages("gganimate")
library(gganimate)
#install.packages("gifski")
#install.packages("av")
#library(gifski)
#library(av)

## make dataframe for plotting
## extract data for GGplot version of this
df_animation <- as.data.frame(monocle.object.all@int_colData@listData$reducedDims@listData[["UMAP"]])
## add pt to this data frame:
pt_values <- as.data.frame(pseudotime(monocle.object.all, reduction_method = "UMAP"))
df_animation <- merge(df_animation, pt_values, by="row.names") 
rownames(df_animation) <- df_animation$Row.names
colnames(df_animation)[4] <- "pt"

## make the static plot
p <- ggplot(df_animation, aes(x = UMAP_1, y = UMAP_2, colour = pt)) +
  geom_point() +
  scale_colour_viridis_c(option = "plasma") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none")
## view plot
plot(p)

## make animated plot
## make a category for animation
#df_animation$group <- cut(df_animation$pt, 15)

anim <- p +
  transition_time(pt) +
  shadow_mark()

animate(anim, height = 3, width = 3, units = "in", res = 150, bg = 'transparent')

## to change the resolution - https://stackoverflow.com/questions/49058567/define-size-for-gif-created-by-gganimate-change-dimension-resolution 
```
Save animation
```{r}
anim_save("animated_UMAP_transparent_bg_wt.gif", path = "../images_to_export/")
```

```{r}
## extract pt values
pt_values <- as.data.frame(pseudotime(monocle.object.all, reduction_method = "UMAP"))

tenx.justwt.integrated <- AddMetaData(tenx.justwt.integrated, pt_values, "old_pt_values")
```

```{r}
library(gganimate)

## make dataframe for plotting
## extract data for GGplot version of this
df_animation <- as.data.frame(tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings)
## add pt to this data frame:
pt_values <- as.data.frame(tenx.justwt.integrated@meta.data[ ,c("old_pt_values", "pt_id_cols"), drop = FALSE])
df_animation <- merge(df_animation, pt_values, by="row.names") 
rownames(df_animation) <- df_animation$Row.names
colnames(df_animation)[4] <- "pt"
## remove outlier cells
df_animation <- df_animation[which(df_animation$Row.names %ni% cells_to_remove), ]

#df_animation <- df_animation[order(df_animation$pt, decreasing = TRUE), ]

## make a category for animation
#df_animation$group <- as.numeric(cut(df_animation$pt, 10, include.lowest = TRUE, labels = c(1:10)))

## make the static plot
p <- ggplot(df_animation, aes(x = UMAP_1, y = UMAP_2)) +
  geom_point(col = df_animation$pt_id_cols) +
  #scale_colour_viridis_c(option = "plasma") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none") +
  view_static()
## view plot
plot(p)

## make animated plot


anim <- p +
  transition_time(pt) +
  shadow_mark() +
  #shadow_wake(wake_length = 0.99) #+
  enter_fade()

animate(anim, 
        height = 3, 
        width = 3, 
        units = "in", 
        res = 300, 
        bg = 'white', 
        #duration = 6, 
        fps = 20, 
        device = "png",
        #ref_frame = 1
        end_pause = 10
        )

## save
anim_save("animated_UMAP_highres.gif", path = "../images_to_export/")

## to change the resolution - https://stackoverflow.com/questions/49058567/define-size-for-gif-created-by-gganimate-change-dimension-resolution 
```


#### Define cell identities with branches

Define identities of cells 

male
```{r}
monocle.object_male <- choose_graph_segments(monocle.object.all)
```
female
```{r}
monocle.object_female <- choose_graph_segments(monocle.object.all)
```
bipotential
```{r}
monocle.object_bipot <- choose_graph_segments(monocle.object.all)
```
asexual (pre-branch)
```{r}
monocle.object_asex_pre <- choose_graph_segments(monocle.object.all)
```
asexual fate
```{r}
monocle.object_asex_fate <- choose_graph_segments(monocle.object.all)
```
check
```{r}
df_freq <- data.frame(table(c(colnames(monocle.object_male), colnames(monocle.object_female), colnames(monocle.object_bipot), colnames(monocle.object_asex_pre), colnames(monocle.object_asex_fate))))
paste("number of cells in seurat object is", length(colnames(monocle.object.all)), ". The number of cells selected here with an identitity is", dim(df_freq)[1])
df_freq <- df_freq[df_freq$Freq > 1, ]
df_freq
```
Inspect where these missing cells are:
```{r}
# '%ni%' <- Negate('%in%')
# 
# not_assigned_cells <- colnames(monocle.object.all)[colnames(monocle.object.all) %ni% c(colnames(monocle.object_male), colnames(monocle.object_female), colnames(monocle.object_bipot), colnames(monocle.object_asex), colnames(monocle.object_asex_fate))]
# 
# DimPlot(seurat.object, repel = TRUE, label.size = 5, pt.size = 0.5, cells.highlight = not_assigned_cells, dims = c(2,1), reduction = "DIM_UMAP") +
#   coord_fixed() + 
#   scale_color_manual(values=c("#000000", "#f54e1e"))
```

```{r}
## create annotation dataframe from these results:
df_monocle_sexes <- rbind(data.frame("cell_name" = colnames(monocle.object_male), "sex" = rep("Male", length(colnames(monocle.object_male)))),
                          data.frame("cell_name" = colnames(monocle.object_female), "sex" = rep("Female", length(colnames(monocle.object_female)))),
                          data.frame("cell_name" = colnames(monocle.object_bipot), "sex" = rep("Bipotential", length(colnames(monocle.object_bipot)))),
                          data.frame("cell_name" = colnames(monocle.object_asex_pre), "sex" = rep("Asexual_Early", length(colnames(monocle.object_asex_pre)))),
                          data.frame("cell_name" = colnames(monocle.object_asex_fate), "sex" = rep("Asexual_Late", length(colnames(monocle.object_asex_fate))))
                          #data.frame("cell_name" = not_assigned_cells, "sex" = rep("Unassigned", length(not_assigned_cells)))
                          )

dim(df_monocle_sexes)

## order like the metadata
df_monocle_sexes <- df_monocle_sexes[match(rownames(monocle.object.all@colData), df_monocle_sexes$cell_name), ]

## add this back into the monocle object
monocle.object.all@colData$Sexes_monocle <- df_monocle_sexes$sex

## add this to the seurat object
rownames(df_monocle_sexes) <- df_monocle_sexes$cell_name
df_monocle_sexes_to_add_to_seurat <- df_monocle_sexes[,c("sex"), drop = FALSE]
tenx.justwt.integrated <- AddMetaData(tenx.justwt.integrated, df_monocle_sexes_to_add_to_seurat, col.name = "monocle_sex")
```

pt vs. real time
```{r}
library(ggridges)

df_plotting <- as.data.frame(tenx.justwt.integrated@meta.data)

pt_real_plot <- ggplot(df_plotting, aes(x = old_pt_values, y =  as.factor(Prediction.Spearman.), fill = as.factor(Prediction.Spearman.))) +
  geom_density_ridges(alpha = .1,
                      rel_min_height = 0.03,  
                               quantile_lines = TRUE, 
                               quantiles = 2,
                      scale = 2) + # plot rigdelines without points
  # plot points with invisible ridgelines
  geom_density_ridges_gradient(point_alpha = 1, 
                               alpha = 0,
                               colour = NA,
                               na.rm = FALSE, 
                               jittered_points = TRUE,
                               aes(point_color = monocle_sex)
                               )  +
  theme_classic() +
  scale_y_discrete(drop=FALSE) +
  labs(title = '', x = "Pseudotime", y = "Predicted Real-time (hpi)") +
  #scale_fill_viridis_c(name = "pseudotime", option = "C", alpha = 1) +
  theme(legend.position = "bottom",
        panel.grid.major.y = element_line(size = (0.2), colour="grey")
        )

pt_real_plot
```

```{r}
gg <- ggplot(df_plotting, aes(as.factor(Prediction.Spearman.), fill = monocle_sex)) + geom_bar(stat = "count", position="fill")

gg
```


# 8. Plots {.tabset}

make composite pseudotime/ID figure

```{r}
# 1 = blue - "#0052c5"
# 2 = red - "#a52b1e"
# 3 = green - "#016c00"
# 4 = yellow - "#ffe400"
#pal_sex <- c("#0052c5","#ffe400", "#a52b1e", "#016c00")

## extract pseudotime numbers and identity of cells to a dataframe
df_pt_id <- tenx.justwt.integrated@meta.data[,c("old_pt_values", "monocle_sex")]

## inspect possible values
list_of_sexes <- names(table(df_pt_id$monocle_sex))

## make a new column
df_pt_id$colour <- NA

## make colour ramps
asex_ramp <- colorRampPalette(c("#D5E3F5", "#0052c5"))
male_ramp <- colorRampPalette(c("white", "yellow", "#016c00"))
female_ramp <- colorRampPalette(c("yellow", "#a52b1e"))
bipot_ramp <- colorRampPalette(c("white", "#ffe400"))

## re-classify the cells that are unassigned cells removed from sexual branch above:
#df_pt_id[which(rownames(df_pt_id) %in% remove_cells), ]$monocle_sex <- "Asexual"

## assign values to each cluster
## help here: https://stackoverflow.com/questions/9946630/colour-points-in-a-plot-differently-depending-on-a-vector-of-values 
df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$colour <- asex_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$old_pt_values,breaks = 100))]

df_pt_id[df_pt_id$monocle_sex == "Male", ]$colour <- male_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Male", ]$old_pt_values,breaks = 100))]

df_pt_id[df_pt_id$monocle_sex == "Female", ]$colour <- female_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Female", ]$old_pt_values,breaks = 100))]

df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$colour <- bipot_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$old_pt_values,breaks = 100))]

## check everything has a value
#table(is.na(df_pt_id$colour))

## make into a df
#df_pt_id <- df_pt_id[ ,"colour", drop = FALSE]

## add back to seurat object
df <- df_pt_id[ ,"colour", drop = FALSE]
tenx.justwt.integrated <- AddMetaData(tenx.justwt.integrated, df, "pt_id_cols")
rm(df)

## plot
## extract UMAP coords
df_umap_plot <- tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings
df_umap_plot <- merge(df_umap_plot, df_pt_id, by=0, all=TRUE)

## add tree
##The tree for monocle is located here:
# monocle.object@principal_graph_aux[["UMAP"]]$dp_mst 
ica_space_df <- t(monocle.object.all@principal_graph_aux[["UMAP"]]$dp_mst) %>%
      as.data.frame() %>%
      dplyr::select_(prin_graph_dim_1 = "UMAP_1", prin_graph_dim_2 = "UMAP_2") %>%
      dplyr::mutate(sample_name = rownames(.),
                    sample_state = rownames(.))

dp_mst <- monocle.object.all@principal_graph[["UMAP"]]

edge_df <- dp_mst %>%
      igraph::as_data_frame() %>%
      dplyr::select_(source = "from", target = "to") %>%
      dplyr::left_join(ica_space_df %>%
                         dplyr::select_(
                           source="sample_name",
                           source_prin_graph_dim_1="prin_graph_dim_1",
                           source_prin_graph_dim_2="prin_graph_dim_2"),
                       by = "source") %>%
      dplyr::left_join(ica_space_df %>%
                         dplyr::select_(
                           target="sample_name",
                           target_prin_graph_dim_1="prin_graph_dim_1",
                           target_prin_graph_dim_2="prin_graph_dim_2"),
                       by = "target")

## make ggplot
umap_id_pt <- ggplot(df_umap_plot, aes(x = UMAP_1, y = UMAP_2)) + 
                     geom_point(col = df_umap_plot$colour) +
                     theme_void() +
                     coord_fixed() +
                     geom_segment(aes_string(x="source_prin_graph_dim_1",
                                     y="source_prin_graph_dim_2",
                                     xend="target_prin_graph_dim_1",
                                     yend="target_prin_graph_dim_2"),
                          data=edge_df)

umap_id_pt
```

save
```{r}
#ggsave("../images_to_export/umap_id_pt.png", plot = umap_id_pt, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
```
cluster plot
```{r}
### Prepare data

## extract df with cluster ID, 
df_pt_id <- tenx.justwt.integrated@meta.data[ ,"seurat_clusters", drop=FALSE]
## extract UMAP coords
df_plot <- tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings
df_plot <- merge(df_plot, df_pt_id, by=0, all=TRUE)
## add colour col
df_plot$colour <- df_plot$seurat_clusters
levels(df_plot$colour) <- list(print(asex_ramp(17)[1]) = "9", asex_ramp(17)[2]="4", 
                               asex_ramp(17)[3]="15", 
                                asex_ramp(17)[4] = "8", 
                                asex_ramp(17)[5]= "1",
                                asex_ramp(17)[6]=  "14", 
                                asex_ramp(17)[7] = "2", 
                                asex_ramp(17)[8] = "10",
                                asex_ramp(17)[9] = "3",
                                asex_ramp(17)[10] = "0",
                                asex_ramp(17)[11] = "6",
                                asex_ramp(17)[12] = "5",
                                asex_ramp(17)[13] = "7",
                                asex_ramp(17)[14] = "12",
                                asex_ramp(17)[15] = "18",
                                asex_ramp(17)[16] = "20",
                                asex_ramp(17)[17] = "23",
                                bipot_ramp(3)[2] = "11",
                                male_ramp(3)[2] = "16",
                                male_ramp(3)[3] = "13",
                                female_ramp(4)[2] = "21",
                                female_ramp(4)[3] = "22",
                                female_ramp(4)[4] = "19",
                                female_ramp(4)[4] = "17"
                                      )
## aabbreviate clusters
df_plot$cluster_names <- df_plot$seurat_clusters
levels(df_plot$cluster_names) <- list(A_1="9", 
                                      A_2="4", 
                                      A_3="15", 
                                      A_4 = "8", 
                                      A_5 = "1",
                                      A_6=  "14", 
                                      A_7 = "2", 
                                      A_8 = "10",
                                      A_9 = "3",
                                      A_10 = "0",
                                      A_11 = "6",
                                      A_12 = "5",
                                      A_13 = "7",
                                      A_14 = "12",
                                      A_15 = "18",
                                      A_16 = "20",
                                      A_17 = "23",
                                      B = "11",
                                      M_1 = "16",
                                      M_2 = "13",
                                      F_1 = "21",
                                      F_2 = "22",
                                      F_3 = "19",
                                      F_3 = "17"
                                      )

library(dplyr)
dat%>%
group_by(custid)%>% 
summarise(Mean=mean(value), Max=max(value), Min=min(value), Median=median(value), Std=sd(value))



## plot
umap_id <- ggplot(df_plot, aes(x = UMAP_1, y = UMAP_2)) + 
                     geom_point(col = df_plot$colour) +
                     theme_void() +
                     coord_fixed()

umap_id
```

```{r}
## these get defined later on, but are replicated above here for plotting
asexual_early_clusters <- c(9, 4, 15, 8, 1, 14, 2, 10, 3, 0, 6, 5)
asexual_late_clusters <- c(7, 12, 18, 20, 23)
bipotentional_early_clusters <- # 0?
bipotential_clusters <- c(11) 
male_clusters <- c(16, 13)
female_clusters <- c(21, 22, 17, 19)

## make a new column for plotting
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- tenx.justwt.integrated@meta.data$seurat_clusters

## reorder the levels so you can plot the cluters as you wish
my_levels <- c(asexual_early_clusters,asexual_late_clusters, bipotential_clusters, male_clusters, female_clusters)
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- factor(x = tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting, levels = my_levels)

## change the levels
levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting) <- list(A_1="9", 
                                      A_2="4", 
                                      A_3="15", 
                                      A_4 = "8", 
                                      A_5 = "1",
                                      A_6=  "14", 
                                      A_7 = "2", 
                                      A_8 = "10",
                                      A_9 = "3",
                                      A_10 = "0",
                                      A_11 = "6",
                                      A_12 = "5",
                                      A_13 = "7",
                                      A_14 = "12",
                                      A_15 = "18",
                                      A_16 = "20",
                                      A_17 = "23",
                                      P = "11",
                                      M_1 = "16",
                                      M_2 = "13",
                                      F_1 = "21",
                                      F_2 = "22",
                                      F_3 = "19",
                                      F_3 = "17"
                                      )

### Annotation set-up

## extract pseudotime numbers and identity of cells to a dataframe
df_pt_id <- tenx.justwt.integrated@meta.data[,c("old_pt_values", "monocle_sex", "seurat_clusters_dot_plotting", "Prediction.Spearman.", "Prediction.Spearman._Kasia")]

## make a new column
df_pt_id$colour <- NA

## assign bins to each of the values
## help here: https://stackoverflow.com/questions/9946630/colour-points-in-a-plot-differently-depending-on-a-vector-of-values 
df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Male", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Male", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Female", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Female", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$old_pt_values,breaks = 100))

## make colour ramps
asex_ramp <- colorRampPalette(c("#D5E3F5", "#0052c5"))
male_ramp <- colorRampPalette(c("white", "yellow", "#016c00"))
female_ramp <- colorRampPalette(c("yellow", "#a52b1e"))
bipot_ramp <- colorRampPalette(c("white", "#ffe400"))

## assign values to each cluster
## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "colour"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
## BECAUSE we have ordered the clusters already, we can simply take the row index for this bit
df_annotation$colour <- NA
df_annotation[1:17, ]$colour <- asex_ramp(100)[df_annotation[1:17, ]$x]
df_annotation[18, ]$colour <- bipot_ramp(100)[df_annotation[18, ]$x]
df_annotation[19:20, ]$colour <- male_ramp(100)[df_annotation[19:20, ]$x]
df_annotation[21:23, ]$colour <- female_ramp(100)[df_annotation[21:23, ]$x]

## make colour pallete
#pal_plot <- c(asex_ramp(170)[1,11,21,31,], bipot_ramp(3)[2], male_ramp(5)[3:4], female_ramp(5)[2:5])
pal_plot <- c(df_annotation$colour, '#49C16DFF')

## plot
umap_with_clusters <- DimPlot(tenx.justwt.integrated,
        group.by = "seurat_clusters_dot_plotting",
        dims = c(1,2), 
        reduction = "umap", 
        pt.size = 1,
        label.box = TRUE,
        label.size = 8,
        label.color	= c(rep(c("#000000"), 12), rep(c("#ffffff"), 5), rep(c("#000000"), 2),"#ffffff", "#000000", "#ffffff", "#ffffff"),
        order = TRUE, 
        repel = TRUE, 
        label = TRUE,
        cols = pal_plot) +
  scale_color_manual(labels = c("Asexual 1", "Asexual 2" , "Asexual 3", "Asexual 4", "Asexual 5", "Asexual 6", "Asexual 7", "Asexual 8",  "Asexual 9", "Asexual 10", "Asexual 11", "Asexual 12", "Asexual 13", "Asexual 14", "Asexual 15", "Asexual 16", "Asexual 17", "Progenitor", "Male 1", "Male 2", "Female 1", "Female 2", "Female 3"), values = pal_plot) +
  theme_void() +
                     coord_fixed()

umap_with_clusters
```

save
```{r}
ggsave("../images_to_export/umap_with_clusters.png", plot = umap_with_clusters, device = "png", path = NULL, scale = 1, width = 21, height = 29.5, units = "cm", dpi = 300, limitsize = TRUE)
```

```{r}
## extract proportion of cells
library(plyr)
df_prop_comb <- as.data.frame(table(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting, tenx.justwt.integrated@meta.data$experiment))
names(df_prop_comb) <- c("cluster", "experiment", "Freq")

## calcualte the percentage
df_prop_comb$pc <- NA
df_prop_comb[df_prop_comb$experiment == "mutants", ]$pc <- (df_prop_comb[df_prop_comb$experiment == "mutants", ]$Freq/sum(df_prop_comb[df_prop_comb$experiment == "mutants", ]$Freq))*100
df_prop_comb[df_prop_comb$experiment == "tenx_5k", ]$pc <- (df_prop_comb[df_prop_comb$experiment == "tenx_5k", ]$Freq/sum(df_prop_comb[df_prop_comb$experiment == "tenx_5k", ]$Freq))*100

## reorder levels in Var1 
#df_prop_comb$cluster <- factor(df_prop_comb$cluster, levels = c("unassigned", "M", "F", "5", "7", "4", "3", "1", "0", "2", "6"))

## then reorder by this so the cumsum will work below
df_prop_comb <- df_prop_comb[rev(order(df_prop_comb$experiment, df_prop_comb$cluster)),]

## Calculate the cumulative sum of len for each dose
df_cumsum <- ddply(df_prop_comb, "experiment", transform, label_ypos=cumsum(pc) - 0.5*pc)
head(df_cumsum)
# http://www.sthda.com/english/wiki/ggplot2-barplots-quick-start-guide-r-software-and-data-visualization

## reverse the levels in Var1
#df_cumsum$cluster <- factor(df_cumsum$cluster, levels = rev(c(levels(df_cumsum$cluster))))
#df_cumsum <- df_cumsum[c(match(df_cumsum$cluster[1:23], c(levels(df_cumsum$cluster))), match(df_cumsum$cluster[1:23], c(levels(df_cumsum$cluster))) + 23), ]

library(ggrepel)
library(ggpubr)
## make plot
plot_prop <- ggplot(data=df_cumsum, aes(x=experiment, y=pc, fill=cluster)) +
  geom_bar(stat="identity")+
  geom_label_repel(aes(label = Freq, y=label_ypos, fill = factor(cluster)), color = c(rep(c(rep(c("#ffffff"), 2), "#000000", "#ffffff", rep(c("#000000"), 2), rep(c("#ffffff"), 5), rep(c("#000000"), 12)), 2)), size = 3.5, show.legend = FALSE) +
  #geom_text(aes(y=label_ypos, label=Freq), vjust=0, color="black", size=3.5) +
  labs(fill = "Cell cluster", y= "Cell Proportions (%)", x = "Technology") +
  scale_fill_manual(labels = c("Asexual 1", "Asexual 2" , "Asexual 3", "Asexual 4", "Asexual 5", "Asexual 6", "Asexual 7", "Asexual 8",  "Asexual 9", "Asexual 10", "Asexual 11", "Asexual 12", "Asexual 13", "Asexual 14", "Asexual 15", "Asexual 16", "Asexual 17", "Progenitor", "Male 1", "Male 2", "Female 1", "Female 2", "Female 3"), values = pal_plot) +
  theme_pubr() +
  theme(legend.text=element_text(size=10), legend.position="bottom") +
  scale_x_discrete(labels= rev(c("10x", "Smart-seq2"))) +
  coord_flip() +
  guides(fill=guide_legend(ncol=9))


plot_prop
```
```{r}
ggsave("../images_to_export/WT_cell_type_proportions.png", plot = plot_prop, device = "png", path = NULL, scale = 1, width = 30, height = 9, units = "cm", dpi = 300, limitsize = TRUE)
```

### dotplot

A dotplot allows us to look at the expression of multiple genes in a clearer way than succesive UMAP plots

```{r, fig.width = 9, fig.height= 8}
### Data set-up

# PBANKA-0828000         GCSKO-3  GD1

# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-0413400    GCSKO-10_820  MD3
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-0102400         GCSKO-2  MD5 

# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-1435200        GCSKO-20  FD4 

# PBANKA-1437500 - AP2G - commitment
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)

marker_genes_list <- c("PBANKA-1437500", "PBANKA-1319500", "PBANKA-0416100", "PBANKA-0831000", "PBANKA-1102200")
mutant_genes_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0413400", "PBANKA-0716500", "PBANKA-0102400",  "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

## these get defined later on, but are replicated above here for plotting
asexual_early_clusters <- c(9, 4, 15, 8, 1, 14, 2, 10, 3, 0, 6, 5)
asexual_late_clusters <- c(7, 12, 18, 20, 23)
bipotentional_early_clusters <- # 0?
bipotential_clusters <- c(11) 
male_clusters <- c(16, 13)
female_clusters <- c(21, 22, 17, 19)

## copy the clusters so you don't permanently edit the master
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- tenx.justwt.integrated@meta.data$seurat_clusters

## reorder the levels so you can plot the cluters as you wish
my_levels <- c(asexual_early_clusters,asexual_late_clusters, bipotential_clusters, male_clusters, female_clusters)

## reorder the levels
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- factor(x = tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting, levels = my_levels)

## rename clusters so that they are intuitive
# this trick is from here: http://www.cookbook-r.com/Manipulating_data/Renaming_levels_of_a_factor/
levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting) <- list(Asexual_1="9", 
                                                                              Asexual_2="4", 
                                                                              Asexual_3="15", 
                                                                              Asexual_4 = "8", 
                                                                              Asexual_5 = "1",
                                                                              Asexual_6=  "14", 
                                                                              Asexual_7 = "2", 
                                                                              Asexual_8 = "10",
                                                                              Asexual_9 = "3",
                                                                              Asexual_10 = "0",
                                                                              Asexual_11 = "6",
                                                                              Asexual_12 = "5",
                                                                              Asexual_13 = "7",
                                                                              Asexual_14 = "12",
                                                                              Asexual_15 = "18",
                                                                              Asexual_16 = "20",
                                                                              Asexual_17 = "23",
                                                                              Progenitor = "11",
                                                                              Male_1 = "16",
                                                                              Male_2 = "13",
                                                                              Female_1 = "21",
                                                                              Female_2 = "22",
                                                                              Female_3 = "19",
                                                                              Female_3 = "17"
                                                                              )

### Annotation set-up

## extract pseudotime numbers and identity of cells to a dataframe
df_pt_id <- tenx.justwt.integrated@meta.data[,c("old_pt_values", "monocle_sex", "seurat_clusters_dot_plotting", "Prediction.Spearman.", "Prediction.Spearman._Kasia")]

## make a new column
df_pt_id$colour <- NA

## assign bins to each of the values
## help here: https://stackoverflow.com/questions/9946630/colour-points-in-a-plot-differently-depending-on-a-vector-of-values 
df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Male", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Male", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Female", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Female", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$old_pt_values,breaks = 100))

## make colour ramps
asex_ramp <- colorRampPalette(c("#D5E3F5", "#0052c5"))
male_ramp <- colorRampPalette(c("white", "yellow", "#016c00"))
female_ramp <- colorRampPalette(c("yellow", "#a52b1e"))
bipot_ramp <- colorRampPalette(c("white", "#ffe400"))

## assign values to each cluster
## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "colour"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
## BECAUSE we have ordered the clusters already, we can simply take the row index for this bit
df_annotation$colour <- NA
df_annotation[1:17, ]$colour <- asex_ramp(100)[df_annotation[1:17, ]$x]
df_annotation[18, ]$colour <- bipot_ramp(100)[df_annotation[18, ]$x]
df_annotation[19:20, ]$colour <- male_ramp(100)[df_annotation[19:20, ]$x]
df_annotation[21:23, ]$colour <- female_ramp(100)[df_annotation[21:23, ]$x]

## plot annotation 
## this really helped: https://www.biostars.org/p/396810/
h2 <- ggplot(df_annotation)+
  geom_bar(mapping = aes(x = Group.1, y = 1), 
           stat = "identity",
           fill = df_annotation$colour,
           col = "#FFFFFF",
           width = 1)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))#+
     #facet_grid(.~colour, scales = "free_x")
     #legend <- plot_grid(get_legend(h2), get_legend(h1), ncol = 1)
 h2 <- h2 + theme(legend.position = "none")
 #   geom_point(col = df_umap_plot$colour)
 
 h1 <- ggplot(df_annotation)+
  geom_point(mapping = aes(x = Group.1, y = 1),
            col = df_annotation$colour,
            size = 5)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))#+
     #facet_grid(.~colour, scales = "free_x")
     #legend <- plot_grid(get_legend(h2), get_legend(h1), ncol = 1)
 h1 <- h1 + theme(legend.position = "none")
 #   geom_point(col = df_umap_plot$colour)

 ## add predicted time point
 ## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "Prediction.Spearman."], list(df_pt_id$seurat_clusters_dot_plotting), mean)
df_annotation$colour <- NA
df_annotation$colour <- viridis(300)[(df_annotation$x)*10]

h3 <- ggplot(df_annotation)+
  geom_bar(mapping = aes(x = Group.1, y = 1), 
           stat = "identity",
           fill = df_annotation$colour,
           col = "#FFFFFF",
           width = 1)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))#+
     #facet_grid(.~colour, scales = "free_x")
     legend_h3 <- get_legend(h3)
 h3 <- h3 + theme(legend.position = "none")

  h3 <- ggplot(data = df_annotation, 
              mapping = aes(x = Group.1, fill=x)
              ) +
       geom_bar(col = "#FFFFFF", width = 1)+
       theme_void() +
       theme(panel.spacing.x = unit(1, "mm")) +
       scale_fill_viridis(option = 'cividis')
 legend_h3 <- get_legend(h3)
 h3 <- h3 + theme(legend.position = "none")
 
 ## add kasia data
  ## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "Prediction.Spearman._Kasia"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
df_annotation$colour <- NA
df_annotation$colour <- viridis(300)[(df_annotation$x)*10]

h4 <- ggplot(df_annotation)+
  geom_bar(mapping = aes(x = Group.1, y = 1), 
           stat = "identity",
           fill = df_annotation$colour,
           col = "#FFFFFF",
           width = 1)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))+
     facet_grid(.~colour, scales = "free_x")
     legend <- plot_grid(get_legend(h4), ncol = 1)
    legend_h4 <- get_legend(h4)
 h4 <- h4 + theme(legend.position = "none")

 h4 <- ggplot(data = df_annotation, 
              mapping = aes(x = Group.1, fill=x)
              ) +
       geom_bar(col = "#FFFFFF", width = 1)+
       theme_void() +
       theme(panel.spacing.x = unit(1, "mm")) +
       scale_fill_viridis()
 legend_h4 <- get_legend(h4)
 h4 <- h4 + theme(legend.position = "none")

### Plot
dot_plot_markers <- DotPlot(tenx.justwt.integrated, 
                            features = c(marker_genes_list, rev(mutant_genes_list)), 
                            group.by = "seurat_clusters_dot_plotting", 
                            dot.min = 0.00001,
                            assay = 'RNA') +
  theme_classic() +
  coord_fixed() +
  coord_flip() +
  # change appearance and remove axis elements, and make room for arrows
  theme(axis.text.x = element_text(size=16, angle = 45, hjust=1,vjust=1, family = "Arial"), 
        axis.text.y = element_text(size=16, face="italic"), 
        text=element_text(size=16, family="Arial", colour="black"), 
        legend.position = "bottom", 
        legend.direction = "horizontal", 
        legend.box = "vertical", 
        plot.title = element_blank(), 
        plot.margin = unit(c(1,3,1,3), "lines"), 
        axis.line = element_blank(),
        panel.border = element_rect(colour = "black", fill=NA, size=0.5)) +
  #change the colours
  #scale_colour_viridis(option = "inferno", guide = "colourbar", na.value="white", begin = 0, end = 1, direction = 1) +
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## change x axis label
  # labs(x = "Marker Genes", y = "Cluster", title = "Expression of Marker Genes by Cluster") 
  labs(x = "", y = "", title = "") +
  ## add arrows
  #annotate("segment", x = 5.5, xend = 5.5, y = 21.5, yend = 25, colour = "green", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
  #annotate("segment", x = 5.5, xend = 5.5, y = 16.5, yend = 21.5, colour = "red", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
  #annotate("segment", x = 5.5, xend = 5.5, y = 0, yend = 15.5, colour = "grey", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
  ## annotate asex
  geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
  ## annotate bipotential
  geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters, bipotential_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
  ## annotate sexes
  geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters, bipotential_clusters, male_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
  ## change label on bottom of plot so we can indicate markers
  scale_x_discrete(labels = rev(c("gd1", "md1", "md2", "md3", "md4", "md5", "fd1", "fd2", "fd3", "fd4", "msp8", "msp1", "mg1", "ccp2", "ap2g"))) +
  ## change label on bottom of plot so we can indicate markers
  scale_y_discrete(labels = c("Asexual 1", "Asexual 2" , "Asexual 3", "Asexual 4", "Asexual 5", "Asexual 6", "Asexual 7", "Asexual 8",  "Asexual 9", "Asexual 10", "Asexual 11", "Asexual 12", "Asexual 13", "Asexual 14", "Asexual 15", "Asexual 16", "Asexual 17", "Progenitor", "Male 1", "Male 2", "Female 1", "Female 2", "Female 3")) +
  ## change name of legends
  guides(col=guide_colorbar(title = 'Scaled Average Expression'),
         size=guide_legend("% of cells expressing"))

## view
#print(dot_plot_markers)

plot <- plot_grid(h4, h3, h1, dot_plot_markers, align = "v", ncol = 1, axis = "tb", rel_heights = c(0.5, 0.5, 0.5, 18)) 

dot_plot <- plot_grid(plot, legend_h3, legend_h4, nrow = 1, rel_widths = c(10, 1.5, 1.5))

dot_plot
```

save
```{r}
ggsave("../images_to_export/dot_plot_all.png", plot = dot_plot, device = "png", path = NULL, scale = 1, width = 29.7, height = 21, units = "cm", dpi = 300, limitsize = TRUE)
```

Make individual plots highlighting where cells in each cluster fall
```{r, echo = FALSE, message=FALSE}
## for loop which takes each cluster and makes a list of cells and then plots a highlighted plot and adds it to a list

## make a blank list
list_UMAPs_by_cluster <- vector(mode = "list", length = length(levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting)))

## for loop
for(i in seq_along(levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting))){
  ## make a list of cells
  list_of_cells <- rownames(tenx.justwt.integrated@meta.data[which(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting == levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting)[i]), ])
  umap_plot <- DimPlot(tenx.justwt.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = list_of_cells, dims = c(1,2), reduction = "umap") +
    ## fix coordinates
    coord_fixed() + 
  scale_color_manual(values=c("#D3D3D3", "#1D1564")) + 
  theme_void() + 
  labs(title = paste(levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting)[i])) + 
  theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
  ## add to the list
  list_UMAPs_by_cluster[[i]] <- umap_plot
}

## check number of clusters
length(list_UMAPs_by_cluster)
```

plot
```{r, fig.height = 10, fig.width = 10}
## this function writes the next bit of code for you
## put it into the console and paste the response
#ploty <- c()
#for(i in seq_along(levels(tenx.justwt.integrated@meta.data$seurat_clusters))){
#  ploty <- paste0(ploty, "list_UMAPs_by_cluster[[", i, "]]", " + ")
#}

## plot
composite_plot <- plot_grid(list_UMAPs_by_cluster[[1]], list_UMAPs_by_cluster[[2]], list_UMAPs_by_cluster[[3]], list_UMAPs_by_cluster[[4]], list_UMAPs_by_cluster[[5]], list_UMAPs_by_cluster[[6]], list_UMAPs_by_cluster[[7]], list_UMAPs_by_cluster[[8]], list_UMAPs_by_cluster[[9]], list_UMAPs_by_cluster[[10]], list_UMAPs_by_cluster[[11]], list_UMAPs_by_cluster[[12]], list_UMAPs_by_cluster[[13]], list_UMAPs_by_cluster[[14]], list_UMAPs_by_cluster[[15]], list_UMAPs_by_cluster[[16]], list_UMAPs_by_cluster[[17]], list_UMAPs_by_cluster[[18]], list_UMAPs_by_cluster[[19]], list_UMAPs_by_cluster[[20]], list_UMAPs_by_cluster[[21]], list_UMAPs_by_cluster[[22]], list_UMAPs_by_cluster[[23]], ncol = 8)

composite_plot
```

save
```{r}
ggsave("../images_to_export/clusters_composite_plot.png", plot = composite_plot, device = "png", path = NULL, scale = 1, width = 21, height = 10, units = "cm", dpi = 300, limitsize = TRUE)
```

# 9. Subset sexual cells {.tabset}

Make a subsetted Seurat object of sexual cells. 

Include the pre-branch too as well as any weird clusters that may have clustered out. 

it's been a while since we looked at the clusters so let's check them out again:
```{r, fig.height = 10, fig.width = 10}
## Plot
DimPlot(tenx.justwt.integrated, label = TRUE, repel = FALSE, pt.size = 0.05, group.by = "seurat_clusters", dims = c(1,2), reduction = "umap") + coord_fixed()

## plot
list_UMAPs_by_cluster[[1]] + list_UMAPs_by_cluster[[2]] + list_UMAPs_by_cluster[[3]] + list_UMAPs_by_cluster[[4]] + list_UMAPs_by_cluster[[5]] + list_UMAPs_by_cluster[[6]] + list_UMAPs_by_cluster[[7]] + list_UMAPs_by_cluster[[8]] + list_UMAPs_by_cluster[[9]] + list_UMAPs_by_cluster[[10]] + list_UMAPs_by_cluster[[11]] + list_UMAPs_by_cluster[[12]] + list_UMAPs_by_cluster[[13]] + list_UMAPs_by_cluster[[14]] + list_UMAPs_by_cluster[[15]] + list_UMAPs_by_cluster[[16]] + list_UMAPs_by_cluster[[17]] + list_UMAPs_by_cluster[[18]] + list_UMAPs_by_cluster[[19]] + list_UMAPs_by_cluster[[20]] + list_UMAPs_by_cluster[[21]] + list_UMAPs_by_cluster[[22]] + list_UMAPs_by_cluster[[23]] + list_UMAPs_by_cluster[[24]]
```

### Define cells and subset
```{r}
asexual_early_clusters <- c(9, 4, 15, 8, 1, 14, 2, 10, 3, 0, 6, 5)
asexual_late_clusters <- c(7, 12, 18, 20, 23)
bipotentional_early_clusters <- # 0?
bipotential_clusters <- c(11) 
male_clusters <- c(16, 13)
female_clusters <- c(21, 22, 17, 19)

## define cells
cell_names_subset_monocle_ids <- rownames(tenx.justwt.integrated@meta.data[tenx.justwt.integrated@meta.data$monocle_sex %in% c("Asexual_Early", "Bipotential", "Male", "Female"), ])

## 3, 0, 6, 5 are early clusters of asexuals before the branch
cell_names_subset_cluster_ids <- rownames(tenx.justwt.integrated@meta.data[tenx.justwt.integrated@meta.data$seurat_clusters %in% c(male_clusters, female_clusters, bipotential_clusters, 3, 0, 6, 5), ])

cell_names_subset_intersect <- intersect(cell_names_subset_monocle_ids, cell_names_subset_cluster_ids)

## subset cells into new object
tenx.justwt.integrated.sex <- subset(tenx.justwt.integrated, cells = cell_names_subset_intersect)
```

### inspect/check
```{r}
## inspect object
tenx.justwt.integrated.sex

## look at original UMAP
DimPlot(tenx.justwt.integrated.sex, label = TRUE, repel = TRUE, pt.size = 0.1, split.by = "experiment", dims = c(1,2), reduction = "umap") + coord_fixed()
```

### Remove contaminant asexual cells

we want to remove:
```{r}
## look at original UMAP
plot_sexual_subsetting <- DimPlot(tenx.justwt.integrated.sex, label = TRUE, repel = TRUE, pt.size = 0.1, dims = c(1,2), reduction = "umap") + 
  coord_fixed() + 
  geom_vline(aes(xintercept = -1, alpha = 5))

plot_sexual_subsetting
```

```{r}
## extract cell embeddings
df_sex_cell_embeddings <- as.data.frame(tenx.justwt.integrated.sex@reductions[["umap"]]@cell.embeddings)

## subset anything lower than -0.8 in UMAP 2 and -0.1 in UMAP 1
remove_cells <- row.names(df_sex_cell_embeddings[which(df_sex_cell_embeddings$UMAP_1 < -1), ])

## plot these cells
DimPlot(tenx.justwt.integrated.sex, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = remove_cells, dims = c(1,2), reduction = "umap") + 
  coord_fixed() + 
  scale_color_manual(values=c("#000000", "#f54e1e")) + 
  theme_void() + 
  labs(title = paste("cells highlighted will be removed")) + 
  theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
```

## Final Subset
```{r}
## make keep cells from the remove_cells
## make the not in function
'%ni%' <- Negate('%in%')
keep_cells <- colnames(tenx.justwt.integrated.sex)[which(colnames(tenx.justwt.integrated.sex) %ni% remove_cells)]

## subset
tenx.justwt.integrated.sex <- subset(tenx.justwt.integrated.sex, cells = keep_cells)

## inspect
tenx.justwt.integrated.sex
```

copy old clusters over
```{r}
## copy old clusters
tenx.justwt.integrated.sex <- AddMetaData(tenx.justwt.integrated.sex, tenx.justwt.integrated.sex@meta.data$seurat_clusters, col.name = "post_integration_clusters")
```

# 10. Sex assignment of mutants 

## A. Seurat Method

```{r}
## find transfer anchors
DefaultAssay(tenx.justwt.integrated) <- "integrated"
merge.anchors <- FindTransferAnchors(reference = tenx.justwt.integrated, query = GCSKO_mutants, 
    dims = 1:30)

## transfer data between ref and query
predictions <- TransferData(anchorset = merge.anchors, refdata = tenx.justwt.integrated$seurat_clusters_dot_plotting, 
    dims = 1:30)

## add meta data to object
GCSKO_mutants <- AddMetaData(GCSKO_mutants, metadata = predictions)

## new object from this
mutant_seurat <- GCSKO_mutants

## look at breakdown of mutant by designation
df <- as.data.frame(mutant_seurat@meta.data)
table(df$predicted.id, df$identity_name_updated)
```

```{r}
## see how this overlaps with fluorescence sorted on for 820 to confirm accuracy
df_820 <- df[df$genetic_background =="PBANKA_820", ]
table(df_820$predicted.id, df_820$fluoresence_sorted_on)
```

inspect gd1
```{r}
df[which(df$predicted.id == "Female_1" & df$identity_name_updated == "gd1"), ][, 118:146]
```


Calculate sex ratios
```{r}
## use designations above for sexes
male_cells <- rownames(mutant_seurat@meta.data[mutant_seurat@meta.data$predicted.id == "Male", ])
female_cells <- rownames(mutant_seurat@meta.data[mutant_seurat@meta.data$predicted.id == "Female", ])
ss2_mutants_final_male <- subset(mutant_seurat, cells = male_cells)
ss2_mutants_final_female <- subset(mutant_seurat, cells = female_cells)

## inspect
ss2_mutants_final_male
ss2_mutants_final_female
```

```{r}
## calculate sex ratios
##subset out H, sorted cells:
df_male <- ss2_mutants_final_male@meta.data[ss2_mutants_final_male@meta.data$exclude_for_sex_ratio == FALSE,]

dim(df_male)

df_female <- ss2_mutants_final_female@meta.data[ss2_mutants_final_female@meta.data$exclude_for_sex_ratio == FALSE,]

dim(df_female)

## make dataframe
df_sex_ratio <- merge(
  as.data.frame(table(df_male$sub_name_updated)), 
  as.data.frame(table(df_female$sub_name_updated)), 
  by = "Var1", all=TRUE)

# or use identity_updated

## add names
names(df_sex_ratio) <- c("genotype", "male", "female")

## change the NAs to 0
df_sex_ratio[is.na(df_sex_ratio)] <- 0

## collapse 820 wild-types together
combined_m <- df_sex_ratio[df_sex_ratio$genotype == "WT-820_3_5", ]$male + df_sex_ratio[df_sex_ratio$genotype == "WT-820", ]$male
combined_f <- df_sex_ratio[df_sex_ratio$genotype == "WT-820_3_5", ]$female + df_sex_ratio[df_sex_ratio$genotype == "WT-820", ]$female
df_sex_ratio <- rbind(df_sex_ratio, c("WT-820-combined", combined_m, combined_f))
# remove old rows
df_sex_ratio <- df_sex_ratio[-which(df_sex_ratio$genotype == "WT-820_3_5" | df_sex_ratio$genotype == "WT-820"), ]
# need to make numeric again
df_sex_ratio$male <- as.numeric(df_sex_ratio$male)
df_sex_ratio$female <- as.numeric(df_sex_ratio$female)
df_sex_ratio$genotype <- as.character(df_sex_ratio$genotype)
# add name for WT combined
df_sex_ratio$genotype[17] <- "WT-820-combined"

## calculate sex ratio
df_sex_ratio$sex_ratio <- (df_sex_ratio$male + 0.1)/(df_sex_ratio$female + 0.1)

## log sex ratio
df_sex_ratio$sex_ratio_log <- log10(df_sex_ratio$sex_ratio)

##view
df_sex_ratio
```

```{r}
## remove WT-2 because it is really inappropriate to have a sex ratio for this:
df_sex_ratio <- df_sex_ratio[-which(df_sex_ratio$genotype == "WT-md5"),]
```

plot
```{r, fig.height = 7, fig.width = 6}
library(latex2exp) # so you can plot the fraction, it's on CRAN `install.packages("latex2exp")`

## make extra column for plotting aesthetics:
df_sex_ratio$above <- df_sex_ratio$sex_ratio_log > 0

## make extra column with ratio in it:
df_sex_ratio$genotype_with_n <- paste0(df_sex_ratio$genotype, " (", df_sex_ratio$male, "/", df_sex_ratio$female, ")")

## reorder genotype so it is in the correct order for plotting
df_sex_ratio$genotype_with_n <- factor(df_sex_ratio$genotype_with_n, levels = df_sex_ratio$genotype_with_n[order(df_sex_ratio$sex_ratio_log)])

## plot
sex_ratio_plot <- ggplot(df_sex_ratio, aes(sex_ratio_log, genotype_with_n, color = above)) +
      ## add the lines for the lollipop plot
      geom_segment(aes(x = 0, y = genotype_with_n, xend = sex_ratio_log, yend = genotype_with_n), color = "grey50") +
      ## add the points for the lollipop plot
      geom_point(aes(size = 4)) +
      ## add the wild-type rectangle
      annotate("rect", xmin= -0.23182478, xmax = 1.00105797, ymin=-Inf , ymax=Inf, alpha=0.4, color=NA,linetype = 2, fill="#999999") +
      ## make prettier
      theme_classic() +
      theme(legend.position = "none", text=element_text(size=16, family="Arial")) + 
      ## change axis labels
      labs(y = TeX("$Genotype\ \\left(\\frac{n_{male}}{n_{female}}\\right)$")) +
      #labs(y = expression(paste("Genotype", group("(", frac(paste("n male"), "n female"), ")")))) +
      xlab(expression(paste("Sex Ratio (", log[10], 
                               group("(",
                                      frac(paste("n male + 0.1"), 
                                           paste("n female + 0.1")),
                                   ")"), ")" ))) +
      ## change colours of lollipops
      scale_colour_manual(values = c("#a52b1e", "#016c00")) +
      ## annotate phenotypes
      geom_hline(aes(yintercept = 5.5)) #+
      #geom_hline(aes(yintercept = 14.5))

print(sex_ratio_plot)

### CHANGE TO BIG BRACKET HERE: https://www.overleaf.com/learn/latex/Brackets_and_Parentheses 

#paste("Sex Ratio", "\n", "log10((n male + 0.1)/(n female + 0.1))")
#theme(legend.position = "none", text=element_text(size=16, family="Arial"))

## fraction titles: https://groups.google.com/forum/#!topic/ggplot2/bgNRnZ82hJY 
## https://uc-r.github.io/lollipop

#ggsave(filename = "../images_to_export/sex_ratio_plot.png", device = "png", width = 7, height = 7, units = "in")
```

## B. SCMAP Method

### Build the index

```{r}
### Making an ortholog reference index

## load in mca data
#counts <- read.csv("../scmap/allpb10x_counts.csv", row.names = 1)
#pheno <- read.csv("../scmap/allpb10x_pheno.csv")
#ggplot(pheno, aes(x=PC2_3d, y = PC3_3d,colour=absclust3)) + geom_point()

## load required libraries
library(scmap) #https://bioconductor.org/packages/release/bioc/html/scmap.html 
library(SingleCellExperiment) #

#prep the SCE, if was originally a Suerat object need the dfs to be regular matrices
#pb_filtered_sce_orth <- pb_filtered_sce_orth[, colData(pb_filtered_sce_orth)$absclust3 != "8"]
#sce <- pb_filtered_sce_orth
#pca <- plotPCA(sce)
#pcs <- pca$data
#table(rownames(pcs)==colnames(sce))
#colData(sce) <- cbind(colData(sce), pcs)
#rowData(sce)$feature_symbol <- rowData(sce)$gene

## extract data from Seurat
#cells_tenx <- rownames(tenx.justwt.integrated@meta.data[which(tenx.justwt.integrated@meta.data$experiment == "tenx_5k"), ])
#tenx.justwt.integrated.10k <- subset(tenx.justwt.integrated, cells = cells_tenx)
counts = as.matrix(GetAssayData(tenx.justwt.integrated, slot = "counts", assay = "RNA"))
pheno = as.data.frame(tenx.justwt.integrated@meta.data)

## add UMAP coordinates
df_sex_cell_embeddings <- as.data.frame(tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings)
pheno <- cbind(pheno, df_sex_cell_embeddings)

## Set up object
sce <- SingleCellExperiment(list(counts=counts),
    colData=DataFrame(label=pheno),
    rowData=DataFrame(feature_symbol=rownames(counts)))
sce

## manual logging
#counts_1 <- assay(sce, "counts")
#libsizes <- colSums(counts_1)
#size.factors <- libsizes/mean(libsizes)
#logcounts(sce) <- log2(t(t(counts_1)/size.factors) + 1)
#counts(sce) <- as.matrix(counts(sce))
#logcounts(sce) <- as.matrix(logcounts(sce))
logcounts(sce) <- as.matrix(GetAssayData(tenx.justwt.integrated, slot = "data", assay = "RNA"))

## remove features with duplicated names
sce <- sce[!duplicated(rownames(sce)), ]

## build scmap-cell reference index, save this rds
sce <- selectFeatures(sce, suppress_plot = FALSE, n_features = 2000)
table(rowData(sce)$scmap_features)
set.seed(1)
sce <- indexCell(sce, M = 50, k = 80)
names(metadata(sce)$scmap_cell_index)
#length(metadata(sce)$scmap_cell_index$subcentroids)
#dim(metadata(sce)$scmap_cell_index$subcentroids[[1]])
#metadata(sce)$scmap_cell_index$subcentroids[[1]][,1:5]
#dim(metadata(sce)$scmap_cell_index$subclusters)

#saveRDS(pb_filtered_sce_orth, file="pb_filtered_sce_orthindex_20181109.rds")
```

### Map to the index

```{r}
#rowData(pb_filtered_sce_orth)$feature_symbol <- rowData(pb_filtered_sce_orth)$orth_name
#rownames(pb_filtered_sce_orth) <- rowData(pb_filtered_sce_orth)$orth_name

#prep the SCE, if was originally a Suerat object need the dfs to be regular matrices
#pb_filtered_sce_orth <- pb_filtered_sce_orth[, colData(pb_filtered_sce_orth)$absclust3 != "8"]
#sce <- pb_filtered_sce_orth
#pca <- plotPCA(sce)
#pcs <- pca$data
#table(rownames(pcs)==colnames(sce))

#colData(sce) <- cbind(colData(sce), pcs)

mutants_counts = as.matrix(GetAssayData(mutant_seurat, slot = "counts", assay = "RNA"))
mutants_pheno = as.data.frame(mutant_seurat@meta.data)

#rowData(sce)$feature_symbol <- rowData(sce)$gene
mutants.sce <- SingleCellExperiment(list(counts=mutants_counts),
    colData=DataFrame(label=mutants_pheno),
    rowData=DataFrame(feature_symbol=rownames(mutants_counts)))
mutants.sce

## add log counts
## manual
#counts_1 <- assay(mutants.sce, "counts")
#libsizes <- colSums(counts_1)
#size.factors <- libsizes/mean(libsizes)
#logcounts(mutants.sce) <- log2(t(t(counts_1)/size.factors) + 1)
## from seurat
logcounts(mutants.sce) <- as.matrix(GetAssayData(mutant_seurat, slot = "data", assay = "RNA"))

#Project query data set onto cell index
scmapCell_results <- scmapCell(
  mutants.sce, 
  list(wt = metadata(sce)$scmap_cell_index
  )
)

##Look into the results
# For each dataset there are two matricies. cells matrix contains the top 10 (scmap default) cell IDs of the cells of the reference dataset that a given cell of the projection dataset is closest to:
#   
#   Give assignments in two ways:
#   1. Take the top cell assignment abs clust, if cosine similarity is less than 0.4 (or adjust if needed) mark as unassigned
# 2. For the top 3 nearest neighbors, get a mean of the PCA coordinates and snap to the nearest cell of those coordinates. If any of the top three cells are sim below 0.4 then mark as unassigned.


##Top cell assignment method

## look at assignments
scmapCell_results$wt$cells[, 1:3]
## get cell indexes 
getcells <- scmapCell_results$wt$cells[1, ]
## exctract cells metadata from index
cdsce <- colData(sce)[getcells, ]
## extract similarities
topsim <- scmapCell_results$wt$similarities[1, ]
## get name of top cell matched
mutants.sce$topcell <- rownames(cdsce)
## get sex id
mutants.sce$topcell_ac <- cdsce$label.monocle_sex
## get coordinates of matched cell
mutants.sce$indexPC1 <- cdsce$label.UMAP_1
mutants.sce$indexPC2 <- cdsce$label.UMAP_2
#mutants.sce$pbpt <- cdsce$pseudotime
#mutants.sce$pbbulk <- cdsce$bulk
mutants.sce$topcell_sp <- mutants.sce$topcell_ac
mutants.sce$topsim <- topsim
mutants.sce$topcell_sp[mutants.sce$topsim < 0.4] <- "unassigned"
table(mutants.sce$topcell_sp)
```

```{r}
## see how this overlaps with fluorescence sorted on for 820 to confirm accuracy
df_820 <- colData(mutants.sce)[colData(mutants.sce)$label.genetic_background =="PBANKA_820", ]
table(df_820$topcell_sp, df_820$label.fluoresence_sorted_on)
```

```{r}
table(colData(mutants.sce)$topcell_sp, colData(mutants.sce)$label.identity_name_updated)
```


```{r}
#### TOP 3NN method

#This function makes a list of the PC means for each cell and then do.call below rbinds them into a dataframe called big_data

datalist = list()

for (i in colnames(scmapCell_results$wt$cells)) {
  
  getcellstest <- scmapCell_results$wt$cells[1:3, i]
  cdscetest <- colData(sce)[getcellstest, ]
  PC1mean <- mean(cdscetest$label.UMAP_1)
  PC2mean <- mean(cdscetest$label.UMAP_2)
  # ... make some data
  dat <- data.frame(i, PC1mean, PC2mean)
  dat$i <- i  # maybe you want to keep track of which iteration produced it?
  datalist[[i]] <- dat # add it to your list
}

big_data = do.call(rbind, datalist)
# or big_data <- dplyr::bind_rows(datalist)
# or big_data <- data.table::rbindlist(datalist)

test <- big_data[1, ]

df <- data.frame(X=colData(sce)$label.UMAP_1, Y=colData(sce)$label.UMAP_2, row.names = rownames(colData(sce)))

#the snap function snaps to the nearest cell in PC coordiantes
snap <- function(df, test){
  require(Biobase)
  d <- matchpt(as.matrix(df),
               as.matrix(data.frame(X=test$PC1mean,Y=test$PC2mean)))
  
  min_row <- rownames(d[d$distance==min(d$distance),])
  
  test$X_snap <- unique(df[min_row,"X"])
  test$Y_snap <- unique(df[min_row,"Y"])
  test$pb_cell <- min_row
  
  test
}

#this loops through each cell and in big_data and runs the snap function
datalist2 = list()
colnames(big_data) <- c("sample_id", "PC1mean", "PC2mean")
for (i in rownames(big_data)) {
  test <- big_data[i, ]
  coord <- snap(df, test)
  coord$i <- i
  datalist2[[i]] <- coord
}
big_data2 = do.call(rbind, datalist2)


table(rownames(big_data2)==rownames(colData(mutants.sce)))

allpbcd <- colData(sce)
allpbcd <- as.data.frame(allpbcd)
pbabsclust <- allpbcd[, c("label.pt_id_cols", "label.identity_name_updated", "label.monocle_sex"), drop=FALSE]
pbabsclust$pb_sample_id <- rownames(pbabsclust)

# Now merge the pc cell asignments with their abs clust and get in the right order
big_data3 <- merge(big_data2, pbabsclust, by.x = "pb_cell", by.y = "pb_sample_id", all.x=TRUE, all.y=FALSE)
big_data4 <- big_data3[match(rownames(big_data2), big_data3$sample_id), ]


colors <- c("6"="#78C679",
            "2"="#D1EC9F",
            "0"="#FEB24C",
            "1"="#F4CF63",
            "3"="#FEEEAA",
            "4"="#85B1D3",
            "7"="#9ecae1",
            "5"="#C9E8F1",
            "M"= "#B7B7D8",
            "F"="#9C96C6",
            "unassigned"="black")

ggplot(big_data4, aes(PC1mean, PC2mean)) +
  geom_point(col = big_data4$label.pt_id_cols) +
                     theme_void() +
                     coord_fixed()

#+ geom_point(aes(colour=factor(big_data4$label.label.pt_id_cols))) + scale_color_manual(values = colors) 

                     

ggplot(big_data4, aes(X_snap, Y_snap)) +
  geom_point(col = big_data4$label.pt_id_cols) +
                     theme_void() +
                     coord_fixed() 

#+ geom_point(aes(colour=factor(big_data4$label.label.pt_id_cols))) + scale_color_manual(values = colors) 

##add info to SCE and save colData, to be an assigned cell all 3NN must have a cos sim >0.4
scmapCell_results$wt$similarities[, 1:3]
topsim1 <- scmapCell_results$wt$similarities[1, ]
topsim2 <- scmapCell_results$wt$similarities[2, ]
topsim3 <- scmapCell_results$wt$similarities[3, ]

table(big_data4$sample_id==rownames(colData(mutants.sce)))
#pfobj$pb_cell <- big_data4$pb_cell
#pfobj$PC1mean <- big_data4$PC1mean
bd4 <- big_data4[, c("pb_cell", "sample_id", "PC1mean", "PC2mean", "X_snap", "Y_snap", "label.pt_id_cols", "label.identity_name_updated", "label.monocle_sex")]

colData(mutants.sce) <- cbind(colData(mutants.sce), bd4)

mutants.sce$topsim1 <- topsim1
mutants.sce$topsim2 <- topsim2
mutants.sce$topsim3 <- topsim3
mutants.sce$stage_pred <- big_data4$label.monocle_sex
mutants.sce$stage_pred[mutants.sce$topsim1 < 0.4 | mutants.sce$topsim2 < 0.4 | mutants.sce$topsim3 < 0.4] <- "unassigned"
table(mutants.sce$stage_pred)

#write.csv(bd4, "pfcellassignmentswithmean3nn_20181029.csv")
#write.csv(colData(pfobj), "pf3d7100scmapclusts2methodindexn100_20190107.csv")
```

```{r}
## see how this overlaps with fluorescence sorted on for 820 to confirm accuracy
df_820 <- colData(mutants.sce)[colData(mutants.sce)$label.genetic_background =="PBANKA_820", ]
table(df_820$stage_pred, df_820$label.fluoresence_sorted_on)
```

```{r}
table(colData(mutants.sce)$stage_pred, colData(mutants.sce)$label.identity_name_updated)
```


## make final plot with MCA data below
```{r}
## Read in MCA data
pheno <- read.csv("../scmap/allpb10x_pheno.csv")
## extract rows needed for plotting
df_plot <- pheno[,c("PC2_3d", "PC3_3d", "absclust3")]
df_plot$experiment <- df_plot$absclust3

## extract rows needed for plotting from mapped object
df_plot_pm <- as.data.frame(colData(pm_ss2_field.sce.orth))
df_plot_pm <- df_plot_pm[ ,c("X_snap", "Y_snap", "label.absclust3")]
df_plot_pm$experiment <- "pm"
colnames(df_plot_pm) <- c("PC2_3d", "PC3_3d", "absclust3", "experiment")

## extract rows needed for plotting from mapped object
df_plot_pf <- as.data.frame(colData(pf_ss2_field.sce.orth))
df_plot_pf <- df_plot_pf[ ,c("X_snap", "Y_snap", "label.absclust3")]
df_plot_pf$experiment <- "pf"
colnames(df_plot_pf) <- c("PC2_3d", "PC3_3d", "absclust3", "experiment")

## bind together
df_plot <- rbind(df_plot, df_plot_pf, df_plot_pm)

colors <- c("6"="#78C679",
            "2"="#D1EC9F",
            "0"="#FEB24C",
            "1"="#F4CF63",
            "3"="#FEEEAA",
            "4"="#85B1D3",
            "7"="#9ecae1",
            "5"="#C9E8F1",
            "M"= "#B7B7D8",
            "F"="#9C96C6",
            "unassigned"="black",
            "pm" = "#dc65a4",
            "pf" = "#24245f")

## add alpha for plotting
df_plot$alpha[!df_plot$experiment == "pm"] <- 0.5
df_plot$alpha[df_plot$experiment == "pm"] <- 1
df_plot$alpha[df_plot$experiment == "pf"] <- 1

##plot
scmap_pca <- ggplot(df_plot, aes(x=PC2_3d, y = PC3_3d,colour=experiment, alpha = alpha)) + 
  geom_point() +
  theme_void() +
  scale_color_manual(values = colors) +
  ## remove alpha scale
  scale_alpha_continuous(guide=FALSE)
  ## draw ring around field samples
  #geom_point(data=df_plot[df_plot$experiment == "pm", ], pch=21, fill=NA, size=2, colour="black", stroke=1) +
  ## make non-field samples more opaque
  #geom_point(data=df_plot[-df_plot$experiment == "pm", ], alpha = 0.2)

## view
scmap_pca_2 <- scmap_pca + 
  guides(colour=guide_legend(override.aes = list(size=4)))

scmap_pca_2
```

save
```{r}
#ggsave("../images_to_export/field_samples_scmap.png", plot = scmap_pca_2, device = "png", path = NULL, scale = 1, width = 15, height = 10, units = "cm", dpi = 300, limitsize = TRUE)
```

## Overlap and validation

```{r}

```

```{r}

```

# 11. Save and Export {.tabset}

## save

Save environment
```{r}
## This saves everything in the global environment for easy recall later
#save.image(file = "GCSKO_merge.RData")
#load(file = "GCSKO_merge.RData")
```

Save object(s)
```{r}
## Save an object to a file
saveRDS(tenx.justwt.integrated.sex, file = "../data_to_export/tenx.justwt.integrated.sex.RDS")
## Restore the object
#readRDS(file = "../data_to_export/tenx.mutant.integrated.sex.RDS")

## save integrated object to file
saveRDS(tenx.justwt.integrated, file = "../data_to_export/tenx.justwt.integrated.RDS") 
## restore the object
#tenx.justwt.integrated <- readRDS("../data_to_export/tenx.justwt.integrated.RDS")

## save monocle object
saveRDS(monocle.object.all, file = "../data_to_export/monocle.object.all.RDS") 
## restore the object
#monocle.object.all <- readRDS("../data_to_export/monocle.object.all.RDS")

## save integrated object to file
#saveRDS(GCSKO_mutants, file = "../data_to_export/GCSKO_mutants.RDS") 
## restore the object
#tenx.justwt.integrated <- readRDS("../data_to_export/tenx.justwt.integrated.RDS")

## extract UMAP coordinates and metadata
## extract metadata
mca_website_df <- tenx.justwt.integrated@meta.data

## add umap - merge on row names and then make sure row names are not dropped or added
mca_website_df <- transform(merge(mca_website_df, tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings, by=0, all=TRUE), row.names=Row.names, Row.names=NULL)

## send this to sunil:
# ggplot(mca_website_df, aes(x = UMAP_1, y = UMAP_2)) +
# geom_point(col = mca_website_df$pt_id_cols) +
# theme_void() +
# coord_fixed()

## export
write.csv(mca_website_df, file = "../data_to_export/mca_website_df.csv")
```

Clean up
```{r}
#rm(ss2_wt_cells)
#rm(tenx.justwt.integrated)
#rm(tenx.justwt.list)
```

## final figure construction

[Cowplot](https://wilkelab.org/cowplot/articles/plot_grid.html)(plot_grid), [patchwork](https://cran.r-project.org/web/packages/patchwork/vignettes/patchwork.html)(wrap_plots), and [ggpubr](http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/81-ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page/) can all allow multiple plots to be plotted together. 

```{r, fig.height = 11.7, fig.width = 8.3}
## A
# umap_id_pt
## B
# marker gene expression
## C
# Mutant gene expression
## D
# Modules

#Figure_A <- grid.arrange(arrangeGrob(QC_composite_plot, QC_mito_violin, QC_mito_graph, QC_by_genotype, mapping_rate_plot), nrow=3), nrow=2, heights=c(10,2))

## cowplot method
## can use this for labels: toupper(letters)[1:10]

## C. Mutant genes
mutant_genes_figure <- plot_grid(
                                 ## marker genes starts
                                 marker_gene_plot_FAMB,
                                 marker_gene_plot_MSP8,
                                 marker_gene_plot_MSP1,
                                 marker_gene_plot_AP2G,
                                 marker_gene_plot_CCP2,
                                 marker_gene_plot_MG1,
                                 ## mutant genes starts  
                                 marker_gene_plot_gd1,
                                 marker_gene_plot_md1,
                                 marker_gene_plot_md2,
                                 marker_gene_plot_md3,
                                 marker_gene_plot_md4, 
                                 marker_gene_plot_md5,
                                 marker_gene_plot_fd1,
                                 marker_gene_plot_fd2,
                                 marker_gene_plot_fd3,
                                 marker_gene_plot_fd4, 
  label_size = 1, 
  nrow = 4)

## labels as uppercase letters:
#c(toupper(letters)[2:17])
# labels as roman numerals:
# c(tolower(as.roman(c(2:17))

Figure_publication <- plot_grid(umap_id_pt + theme(plot.margin = unit(c(0, 0, 0, 0), "cm")), 
                                mutant_genes_figure,
                                ## add empty plot to give spacing
                                ggplot() + theme_void(),
                                labels = c('A', 'B'), 
                                label_size = 12, 
                                ncol = 2, 
                                nrow=2, 
                                rel_heights = c(1, 1, 4), 
                                rel_widths = c(1, 2, 3))

Figure_publication
```

save
```{r}
ggsave("../images_to_export/Figure_C.png", plot = Figure_publication, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
```

```{r, fig.height = 11.7, fig.width = 8.3}
layout <- "
AAABBB
CCDDEE
FFGGHH
IIJJKK
"

Figure_publication_A <- (composition_umap_10x + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10))) +
(composition_umap_ss2 + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10))) + 
plot_layout(ncol = 2)

Figure_publication <- Figure_publication_A +
#(UMAP_hoo + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10)) + labs(title="Predicted Asexual Cycle Real Timepoint")) +
#(UMAP_kasia + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10)) + labs(title="Predicted Gametocytogenesis Real Timepoint")) +
  ## marker genes
marker_gene_plot_FAMB +
marker_gene_plot_MSP8 +
marker_gene_plot_MSP1 +
marker_gene_plot_CDPK5 +
marker_gene_plot_AP2G +
marker_gene_plot_CCP2 +
marker_gene_plot_MG1 +
marker_gene_plot_ORC1 +
marker_gene_plot_MCM4 +
  ## mutant genes starts  
#marker_gene_plot_gd1 +
#marker_gene_plot_md1 +
#marker_gene_plot_md2 +
#marker_gene_plot_md3 +
#marker_gene_plot_md4 + 
#marker_gene_plot_md5 +
#marker_gene_plot_fd1 +
#marker_gene_plot_fd2 +
#marker_gene_plot_fd3 +
#marker_gene_plot_fd4 + 
plot_layout(ncol = 6, 
            widths = c(1, 1, 1, 1, 1 ,1),
            heights = c(2, 1, 1, 1),
            design = layout
            )

Figure_publication
```

save
```{r}
ggsave("../images_to_export/Figure_S2.png", plot = Figure_publication, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
```

# Appendix {.tabset}

## Session Info 
```{r, echo = FALSE}
sessionInfo()
```

# Expression plots

#### Expression - raw - Viridis
```{r, fig.height = 15, fig.width = 15}
## find a good ring marker, to see if there is a better one than the ones reported
#markers_ring <- FindMarkers(tenx.justwt.integrated, ident.1 = c("4", "5", "16", "11", "7", "3", "9", "0", "22"))
#head(markers_ring)

# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/ 


marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1319500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(ccp2), "(Female)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_MG1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0416100", coord.fixed = TRUE, 
                                    #min.cutoff = "q1", 
                                    dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(mg1), "(Male)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_AP2G <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1437500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(ap2g), "(Commitment)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_MSP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0831000", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(msp1), "(Schizont)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_MSP8 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1102200", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(msp8), "(Asexual)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_SBP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1101300", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(sbp1), "(Ring)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_FAMB <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0722600", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(fam-b2),  "(Ring)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_HSP70 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0711900", coord.fixed = TRUE, 
                                      #min.cutoff = "q1", 
                                      dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(HSP70),  "(Reporter)","\n", "PBANKA_0711900")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))

## plot
## cowplot method
marker_gene_plot_all <- plot_grid(marker_gene_plot_FAMB, marker_gene_plot_MSP8, marker_gene_plot_MSP1, marker_gene_plot_AP2G, marker_gene_plot_CCP2, marker_gene_plot_MG1, marker_gene_plot_HSP70, nrow=3)

marker_gene_plot_all

## patchwork method
#marker_gene_plot_FAMB + marker_gene_plot_MSP8 + marker_gene_plot_MSP1 + marker_gene_plot_AP2G + marker_gene_plot_CCP2 + marker_gene_plot_MG1 + marker_gene_plot_HSP70
```

#### Expression - raw - purple
```{r, fig.height = 15, fig.width = 15}

marker_gene_ramp <- colorRampPalette(c("#D3D3D3", "#1D1564"))(50)

marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1319500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("CCP2 (Female)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_MG1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0416100", coord.fixed = TRUE, 
                                    #min.cutoff = "q1", 
                                    dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("MG1 (Male)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_AP2G <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1437500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("AP2G (Commitment)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_MSP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0831000", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("MSP1 (Schizont)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_MSP8 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1102200", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("MSP8 (Asexual)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_SBP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1101300", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("SBP1 (Ring)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_FAMB <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0722600", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("Fam-b2 (Ring)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_HSP70 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0711900", coord.fixed = TRUE, 
                                      #min.cutoff = "q1", 
                                      dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("(HSP70; Reporter)","\n", "PBANKA_0711900")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

## plot
## cowplot method
marker_gene_plot_all <- plot_grid(marker_gene_plot_FAMB, marker_gene_plot_MSP8, marker_gene_plot_MSP1, marker_gene_plot_AP2G, marker_gene_plot_CCP2, marker_gene_plot_MG1, marker_gene_plot_HSP70, nrow=3)

marker_gene_plot_all
```

#### Expression - data - purple

```{r, fig.height = 10, fig.width = 10}
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-1435200        GCSKO-20  FD4 
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-0413400    GCSKO-10_820  MD5
# PBANKA-0828000         GCSKO-3  GD1
# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-1144800        GCSKO-28  FD5


marker_gene_plot_17 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1418100", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_2 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0102400", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_19 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0716500", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_20 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1435200", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_13 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0902300", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_10 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0413400", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md5")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_3 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0828000", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("gd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_oom <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1302700", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_29 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1447900", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_21 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1454800", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))

## make composite plot
mutant_expression_composite <- wrap_plots(marker_gene_plot_17 , marker_gene_plot_2 , marker_gene_plot_19 , marker_gene_plot_20 , marker_gene_plot_13 , marker_gene_plot_10 , marker_gene_plot_3 , marker_gene_plot_oom , marker_gene_plot_29 , marker_gene_plot_21 , ncol = 4)
           
## print
mutant_expression_composite
```

# Density Plots

#### Density - purple

```{r}
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/ 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_colour_gradientn(colours=marker_gene_ramp) + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_colour_gradientn(colours=marker_gene_ramp) + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)

UMAP_composite_mutant_genes
```


#### Density -viridis

```{r}
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/ 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)) )+ guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)

UMAP_composite_mutant_genes
```

#### Density - viridis
```{r, fig.height = 10, fig.width = 10}
# PBANKA-0828000         GCSKO-3  GD1

# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-0413400    GCSKO-10_820  MD5

# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-1435200        GCSKO-20  FD4 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)) )+ guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)

UMAP_composite_mutant_genes

## IF putting in MAIN - list_of_density_plots_mutant_genes[[2]] + coord_fixed() + theme_void() + theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 8), text=element_text(size=9)) + labs(title = paste("md1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.3, barheight = 4.0, title = ""))
```

```{r, fig.height = 10, fig.width = 10}
# PBANKA-0828000         GCSKO-3  GD1

# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-0413400    GCSKO-10_820  MD5

# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-1435200        GCSKO-20  FD4 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "wkde", slot = 'data')

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_color_continuous_sequential(palette = "Purples 2") + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_color_continuous_sequential(palette = "Purples 2") + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + 
                               coord_fixed() + 
                               theme_void() + 
                               labs(title = paste("md2")) + scale_color_continuous_sequential(palette = "Purples 2") + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)

## old colour scale: scale_colour_gradientn(colours=marker_gene_ramp )

UMAP_composite_mutant_genes
```





